I.  INTRODUCTION


A.  BACKGROUND 

The consumer retail marketplace in the United States has become increasingly
competitive in recent years due to increased communications, online shopping
opportunities, a new world order post “9-11”, and an on-demand mentality. The proper
location of a company’s store and allocation of goods can assist or impede store
performance. Optimizing these factors should be of utmost importance for real estate site
selection. This concept applies to private industry and to the Department of Defense.

Numerous  methodologies  are  used  in  the  private  sector  to  choose  real  estate  for 
business  opportunities.  However,  many  companies  do  not  optimize  their  selections  while 
utilizing  spatial  decision  support  systems  (SDSS)  to  assist  in  their  real  estate  planning. 
This  is  true  for  the  retail,  restaurant,  banking,  and  any  other  industry  where  store  location 
can drive sales. External factors will affect sales based on location. These factors include,
but are not limited to, location, competition, population, and site characteristics.

Military retail facilities are currently aligned with operational military
installations. The methodologies utilized in the private sector can be applied to military
retail facilities. The external factors are relevant for both. An emerging approach for
managing external factors to optimize site selection utilizes artificial intelligence
algorithms. Artificial intelligence algorithms, SDSS, and geographic information systems
(GIS) software have gained popularity for real estate site selection in recent years. This
project looks into possible advantages of applying artificial intelligence models to model
consumer behavior in an effort to prevent opening non-profitable stores and optimize
locations.

B.  RESEARCH  QUESTIONS 

This project is designed to answer basic questions concerning optimizing real
estate site selection. Many companies do utilize computer models to assist in their real
estate site selection process; however, optimizing these results in a market requires further
modeling techniques. The use of computer models utilizing artificial intelligence
algorithms to optimize the site selection problem will be shown as an optimal option for
this process.

1.  Primary  Questions 

The  primary  focus  of  this  project  is  to  answer  the  questions: 

•  Is  there  a  current  need  for  market  planning  modeling?  If  so,  why? 

•  To what extent do artificial intelligence algorithms improve the market
planning process?

•  Can artificial intelligence algorithms be applied successfully to market
planning?

2.  Secondary  Question 

In  order  to  fully  answer  the  primary  questions,  a  secondary  question  needs  to  be 
answered: 

•  What  are  the  limitations  to  current  market  planning  models? 


C.  PROJECT  BENEFITS 

This  project  provides  insight  into  the  real  estate  site  selection  process  and  its 
complexities.  Previously  real  estate  site  selection  was  a  “best-guess  kind  of  game” 
(Chittum,  2005).  This  is  not  necessarily  the  case  anymore.  Current  techniques  in  real 
estate  site  selection  will  be  discussed  showing  their  advantages  in  predicting  sales. 
Improved  accuracy  in  assessing  a  location’s  potential  sales  along  with  its  interaction  with 
current  stores  can  be  vital  in  selecting  sites.  However,  there  are  limitations  to  current  site 
selection  methodologies.  In  order  to  assess  sales  across  an  entire  trade  area  or  market, 
further optimization needs to be introduced. This optimization can come from the use of
artificial intelligence algorithms. These algorithms can be applied across many industries
such as the retail, restaurant, and banking industries.

The results of this project can be directly applied to military retail facilities
(exchanges and commissaries). Optimizing the location and allocation of goods and
services through artificial intelligence algorithms can provide previously unrealized cost
savings to the Department of Defense.


D.  METHODOLOGY 

An  extensive  literature  review  of  relevant  business,  real  estate,  and  technical 
topics  was  conducted.  Internet  sites,  testimonies,  magazine  and  journal  articles,  among 
others  were  reviewed.  In  order  to  understand  current  real  estate  site  selection  techniques, 
a  historical  context  is  provided.  The  origin  of  site  selection  methodologies  was  important 
to  understand  in  order  to  show  the  evolution  of  the  site  selection  process. 

In order to optimize sales for a specific trade area, optimization techniques need
to be utilized. Artificial intelligence algorithms were reviewed as one form of
optimization. The evolution of artificial intelligence algorithms is provided in
order to show why they are just now becoming a viable option for optimization.
Ultimately, genetic algorithms are shown in theory to be the optimal artificial intelligence
algorithm for optimization purposes.


E.  PROJECT  ORGANIZATION 

This  project  is  organized  into  five  chapters.  Chapter  I  provides  an  introduction  to 
the  objectives  of  this  project.  Chapter  II  provides  a  literature  review  highlighting  the 
concepts  of  real  estate  site  selection  and  the  applications  of  emerging  technologies  in  the 
real  estate  site  selection  process.  This  chapter  also  provides  a  look  into  the  origins  of  real 
estate  site  selection  in  business.  Understanding  the  need  for  site  selection  analysis  led  to 
the  research  of  optimization  techniques  and  methods  of  delivery.  This  portion  of  the 
chapter  discusses  how  businesses  utilize  these  methodologies  today. 

Chapter III examines the real estate site selection processes currently used in the
private sector. Four main methods of site selection analysis are presented in depth.
These methods are examined for strengths and weaknesses in their approach to providing
accurate results in operation. Real-world examples of relevant methods are provided to
show the complexities of this analysis and the results they can afford.

Chapter IV examines the application of artificial intelligence algorithms for
optimizing the site selection process. Market planning techniques are examined along
with the use of artificial intelligence algorithms in this process. Site selection models
alone cannot analyze an entire market. Models will look at one store and
its potential interaction in the market. However, they will not take into account how all
stores and locations will interact with each other. In order to do this, further optimization
needs to be conducted. A comparison to traditional real estate site selection techniques is
examined showing actual and potential improvements in this process. Real-world
examples are provided to show actual effectiveness and improvement from use of
artificial intelligence algorithms in practice.

Chapter  V  summarizes  the  results  of  this  project.  Limitations  and  conclusions  are 
discussed.  Applying  the  results  of  this  project  to  military  retail  facilities  is  shown  as  a 
possible  application  of  optimizing  real  estate  site  selection  techniques. 




II.  LITERATURE  REVIEW 


A.  INTRODUCTION 

This  chapter  provides  background  information  on  a  number  of  subject  areas  in 
order  to  lay  the  foundation  for  topics  raised  throughout  the  remainder  of  this  project.  As 
a  result  of  this  literature  review,  a  context  is  created  for  the  analysis  of  the  real  estate  site 
selection  process.  A  main  goal  of  the  real  estate  site  selection  process  is  to  optimize  the 
number  and  location  of  sites  within  a  market  to  provide  maximum  profit  based  upon 
consumer  actions  and  available  real  estate.  Basically,  this  process  will  model  consumer 
behavior  within  a  market  related  to  current  and  potentially  available  real  estate  locations. 
Ultimately  there  will  be  an  optimal  mix  of  sites  that  will  provide  the  maximum  profit  for 
the  company  based  on  these  factors.  These  factors  can  now  be  managed  through 
emerging technologies such as SDSS software and artificial intelligence algorithms. The
following  subject  matter  will  help  frame  business,  management,  and  technical  concepts 
that  can  be  applied  to  the  real  estate  site  selection  process. 

This  chapter  is  broken  into  two  main  portions:  real  estate  site  selection  process 
and  optimization  using  artificial  intelligence.  First  the  theory  of  real  estate  site  selection 
is  discussed.  Studies  detail  the  root  of  the  process  by  discussing  the  multitude  of  factors 
involved.  Industry  experts  highlight  specific  issues  and  methods  that  affect  their  site 
selection  process.  These  methods  are  broken  into  four  main  approaches. 

A discussion on the use of artificial intelligence algorithms is presented next.
Artificial intelligence, specifically genetic algorithms, represents an emerging form of
site  selection  optimization.  Looking  at  one  or  a  few  sites  in  isolation  is  not  sufficient  to 
account  for  the  interactions  between  sites  and  consumer  actions.  Optimization  is  needed 
to  take  into  account  all  factors  that  could  affect  the  profit  for  a  given  market.  This  section 
further  explains  how  artificial  intelligence  can  improve  site  selection  beyond  traditional 
forms of analysis.


B.  REAL ESTATE SITE SELECTION PROCESS OVERVIEW

Choosing  a  site  to  open  a  new  store  or  business  can  be  related  to  the  success  or 
failure  of  that  store  or  business.  Location  and  allocation  of  goods  are  essential 
components to a successful new store. This section focuses on research done on the real
estate site selection process, the processes currently utilized, and the affected
industries.

1.  Real  Estate  Site  Selection  History 

Lea  (1989)  describes  the  history  of  real  estate  site  selection  by  breaking  out  three 
distinct  periods  leading  to  modern  site  selection  techniques:  the  beginnings  of  retail 
theory  via  neoclassical  theories  (1870-1950),  the  renewal  in  site  selection  with 
quantitative analysis (1950-1985), and modern techniques using GIS-based analysis
(1985-present). Each of these periods represents a shift in methodologies of where
commercial  real  estate  would  be  placed.  This  specifically  affected  retail  companies. 

a.  Neoclassical  Theories  (1870-1950) 

The  end  of  the  Victorian  Period  marked  the  first  time  in  North  American 
history  where  products  were  mass-produced  for  customers.  This  production  allowed 
many  smaller  stores  to  open  providing  goods  from  outside  the  local  area.  In  the  1880’s 
larger  distribution  centers  were  appearing  in  order  to  supply  these  more  remote  stores 
(Kates,  1997).  Locations  were  being  chosen  for  stores  based  upon  population,  but  the 
competitive  nature  of  the  retail  industry  was  not  apparent. 

Larger  stores  and  smaller  stores  were  able  to  coexist  in  this  time  without 
competitive  interaction.  William  Reilly  developed  “Reilly’s  Law”  in  1931.  This  law 
stated  that  customers  located  between  two  cities  would  be  drawn  to  the  larger  city  for 
retail purposes. This applied Newtonian physics to consumer behavior. The
attractiveness of a location was related to its distance from the consumer’s
residence. This idea drove many companies’ real estate locations during this time. It
was not until the Great Depression and the two World Wars that competition truly
impacted  the  face  of  real  estate  site  selection  (Kates,  1997).  As  many  goods  and  services 
were  depleted,  store  location  was  becoming  more  important. 
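
As an illustration of the gravity analogy behind Reilly's Law, the sketch below computes the "breaking point" between two cities, that is, the distance from the first city at which a consumer is equally drawn to either one. The formula is the breaking-point form commonly attributed to Converse's later restatement of Reilly's Law and is not taken from the sources cited in this chapter; the function name and the sample figures are hypothetical.

```python
import math


def breaking_point(distance_ab, pop_a, pop_b):
    """Distance from city A at which a shopper is indifferent between A and B.

    Assumes attraction grows with population and falls with the square of
    distance, which is the usual statement of Reilly's Law.
    """
    if pop_a <= 0 or pop_b <= 0:
        raise ValueError("populations must be positive")
    return distance_ab / (1.0 + math.sqrt(pop_b / pop_a))


if __name__ == "__main__":
    # Hypothetical cities 30 miles apart; city A is four times larger than B.
    print(breaking_point(30, 400_000, 100_000))
```

With these hypothetical inputs the larger city draws customers from farther out than the midpoint, which is the behavior the law describes.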

b.  Quantitative Analysis (1950-1985)

The  post-WWII  era  in  North  America  represented  a  time  of  excess  and 
exodus.  The  war  was  over  and  people  were  again  spending  money.  People  wanted  to 
forget  about  the  troubles  of  previous  years  and  invest  in  a  new  future.  Thompson  (1966) 
adds that retail marketing became increasingly important in this period. As men and
women exited their wartime occupations, many geographers were hired
by retail companies to perform business functions. This, coupled with a quantitative
explosion, helped produce a new era in real estate site selection (Kates, 1997).

Geographers  working  in  the  retail  industry,  known  as  marketing 
geographers,  started  looking  at  the  retail  environment  in  new  ways.  Quantitative  and 
qualitative  models  were  produced  to  better  describe  consumer  actions  and  store  locations. 
One  of  the  earliest  methods  for  real  estate  site  selection  is  the  Checklist  Method  (Nelson, 
1958).  This  method  uses  a  standardized  list  of  principles  to  rank  order  current  and 
potential  sites.  This  method  is  more  qualitative,  but  it  does  show  an  important  step 
towards  providing  thought  and  order  in  the  site  selection  process. 
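
One way to picture the rank ordering step is a weighted checklist, sketched below. This is only an assumed illustration: the source does not state that the Checklist Method uses numeric weights, and the factor names, weights, and ratings are hypothetical.

```python
def rank_sites(sites, weights):
    """Return (site, score) pairs sorted from best to worst.

    sites   -- dict mapping site name to {factor: rating}
    weights -- dict mapping factor to its relative importance
    """
    scored = []
    for name, ratings in sites.items():
        # A factor missing from a site's checklist contributes nothing.
        score = sum(w * ratings.get(factor, 0) for factor, w in weights.items())
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical checklist: ratings run from 1 (poor) to 5 (excellent).
weights = {"traffic": 3, "visibility": 2, "parking": 1}
sites = {
    "Site A": {"traffic": 4, "visibility": 3, "parking": 5},
    "Site B": {"traffic": 5, "visibility": 4, "parking": 2},
}
ranking = rank_sites(sites, weights)
```

The ratings themselves still come from an analyst's judgment, which is why the method remains largely qualitative.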

Applebaum  (1966)  provided  the  next  significant  model  to  be  used  in  the 
real  estate  site  selection  process:  The  Analogue  Model.  This  model  uses  analogous  sites 
to  help  determine  if  a  potential  site  will  do  well  in  a  given  market.  After  finding  a 
sufficient number of analogous sites based upon location, demographics, and/or other
factors that will possibly affect sales, a benchmark is developed to describe the potential
site. This method is still used today and will be described in depth later within this
project.
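
A minimal sketch of the benchmark idea follows. It assumes, purely for illustration, that the benchmark is average sales per household across the analogous stores, scaled to the candidate site's trade area; Applebaum's actual procedure is described later in this project and may differ. All names and figures are hypothetical.

```python
def analog_forecast(analogs, candidate_households):
    """Estimate sales for a candidate site from analogous stores.

    analogs -- list of (annual_sales, households_in_trade_area) pairs
    """
    if not analogs:
        raise ValueError("at least one analogous store is required")
    # Sales per household at each analogous store.
    rates = [sales / households for sales, households in analogs]
    # The benchmark is the simple average of those rates.
    benchmark = sum(rates) / len(rates)
    return benchmark * candidate_households


# Three hypothetical analogous stores and a candidate trade area of 8,000 households.
analogs = [(1_000_000, 10_000), (1_200_000, 10_000), (800_000, 10_000)]
forecast = analog_forecast(analogs, 8_000)
```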

Spatial-interaction  (or  gravity)  models  were  also  introduced  in  this  period. 
These  models  provided  more  advanced  quantitative  methods  for  describing  the  trade  area 
based on consumer actions and available real estate. The advent of spatial-interaction
modeling and its application to real estate site selection can be attributed to Huff (1964). He
looked at how consumers visited different shopping areas and suggested that the “utility
of a store depended on the size of the shopping center, travel time, and a parameter that
reflects the effect of travel time on various kinds of shopping trips” (Kates, 1997). This
model will also be examined further within this project.
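
The three ingredients named in the quotation (center size, travel time, and a travel-time parameter) can be combined as in the sketch below, which follows the commonly published form of the Huff model: each center's utility is its size divided by travel time raised to a power, and a consumer's probability of visiting a center is that utility divided by the sum over all centers. The function name, the default exponent, and the sample values are assumptions made for illustration.

```python
def huff_probabilities(sizes, travel_times, lam=2.0):
    """Probability that a consumer patronizes each shopping center.

    sizes        -- attractiveness of each center (e.g., square footage)
    travel_times -- travel time from the consumer to each center (must be > 0)
    lam          -- sensitivity to travel time (assumed default, not from the source)
    """
    if len(sizes) != len(travel_times):
        raise ValueError("sizes and travel_times must have the same length")
    utilities = [s / (t ** lam) for s, t in zip(sizes, travel_times)]
    total = sum(utilities)
    return [u / total for u in utilities]


# Two hypothetical centers of equal size; the first is half the travel time away.
probs = huff_probabilities([100, 100], [5, 10])
```

Because the probabilities sum to one across centers, multiplying them by the spending in each consumer zone gives an estimate of how sales divide among competing locations.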

c.  GIS-based Analysis (1985-present)

The Quantitative Analysis period provided many methods for modeling
consumer behavior; however, there was not a way to incorporate many of these
applications into a centralized tool until the advent of computers in the workplace. As
computers became more available and computing speeds increased, the methods and
models that had been previously developed could be more readily applied in real estate site
selection. GIS systems took these models and allowed analysts to apply them to real
estate site selection quickly and easily.

GIS systems consist of computer hardware, software, and other
peripherals that can transform spatially-referenced information into visually useful
outputs that can be manipulated as needed (Castle, 1993). Castle points out that the main
advantages of using GIS systems include:

•  Data  acquisition,  input,  and  editing; 

•  Database  management; 

•  Query  and  retrieval; 

•  Data  analysis,  modeling,  synthesis;  and 

•  Display,  output,  and  dissemination  of  data  and  information. 

Now analysts could incorporate data and information from multiple sources more quickly than
was previously possible. Less dependence on qualitative and artistic type models was
required for real estate site selection.

2.  Real  Estate  Site  Selection  Process 

As  previously  shown,  the  real  estate  site  selection  process  has  evolved  over  the 
past  few  decades.  John  Dawson,  chief  development  officer  for  Dunkin’  Brands,  states 
that  “Years  ago,  guys  like  myself  did  this  on  gut  feelings”  (Chittum,  2005).  Site  selection 
used to be a best-guess kind of a game. However, recent technological advances,
specifically  in  GIS  software,  have  allowed  companies  and  real  estate  executives  to 
assemble  data  more  easily  in  order  to  make  informed  decisions  (Chittum,  2005). 

Bergeron (2005) provides a model for the site selection process. Figure 1 shows
this process from recognition of company needs to the ultimate deliverables. This
establishes a framework to understand the thought process of evolving real estate site
selection  techniques.  For  the  purpose  of  this  project,  “Plan  the  Work”  and  “Work  the 
Plan”  will  be  looked  at  in  greater  detail  (Bergeron,  2005).  The  operational  site  selection 
techniques to be discussed fall into these categories.

Source: Carol Bergeron’s Making site selection decisions in the worldwide economy in The Handbook of Business Strategy, 2005.


Figure  1.  Site  Selection  Process 


Real  estate  site  selection  techniques  can  be  divided  into  four  main  methods:  the 
checklist  method,  analog  method,  regression  modeling,  and  spatial  interaction  (gravity) 
modeling. All four methods have strengths and weaknesses and can be seen in industry
today. The checklist method uses a checklist of factors (O’Malley et al., 1995). These
factors are generally managed without computer programs and provide a more artistic
approach to site selection due to human interaction in the process. The analog method
bases  projections  and  comparison  on  similar  stores,  while  regression  modeling  utilizes 
advanced  computer  models  (O’Malley  et  al.,  1995).  Lee  and  Pace  (2005)  describe 
gravity  modeling  as  “spatial  dependencies  among  both  consumers  and  retailers.  The 
results  show  that  both  forms  of  spatial  dependence  exert  statistically  and  economically 
significant  impacts  on  the  estimates  of  parameters”  from  the  model. 


3.  Industries Affected


Multiple  industries  can  be  affected  by  improved  real  estate  site  selection 
techniques.  Fryrear,  Prill,  and  Worzala  (2001)  showed  that  the  following  industries  were 
utilizing  geographic  information  to  enhance  their  real  estate  site  selections: 

•  Retail 

•  Real  estate  development 

•  Government 

•  Property  management 

•  Professional  management 

•  Public  utilities 

•  Warehousing,  distribution 

•  Mini-storage 

•  Healthcare 

•  Banking 

Of  these  industries,  retail  was  the  predominant  industry  shown  to  utilize  geographic  data 
for  site  selection  (Fryrear  et  al.,  2001).  Other  industries  can  also  benefit  from  site 
selection  techniques.  The  restaurant  industry,  specifically  chain  restaurants,  tends  to  take 
a  scientific  approach  when  scouting  prospective  locations  (Perlik,  2004).  Also,  service 
industry  companies  like  quick-lube  car  centers  can  rely  on  basic  factors  to  increase  their 
chances  for  success.  Educational  institutions  have  even  utilized  site  selection  techniques 
when  selecting  new  campuses  (Alt,  1967).  Three  specific  factors  that  these  companies 
look  at  are  “proximity  of  both  residential  and  work  areas,  consumers  who  have 
discretionary  income,  and  the  presence  of  a  ‘retail  cluster’”  (Bennett,  2003). 

As  more  companies  within  an  industry  enter  the  market,  competition  increases. 
Improved site selection can help optimize the success of a location and deter a company from opening
suboptimal locations. Muller and Inman (1994) show that the “challenge is to identify
those  factors  that  will  yield  the  most  accurate  predictions.  This  is  where 
geodemographics  and  geographic  information  system  (GIS)  software  play  a  role.”  The 
basic  principles  of  choosing  an  optimal  site  transcend  many  industries.  There  are  no 
limits  to  the  application  of  these  principles.  So  long  as  a  company  wishes  to  open  a  store 
in  an  ideal  location,  a  site  selection  process  will  be  utilized.  The  specific  process  utilized 
will be discussed later in this project.

4.  Company Interviews

Speaking  to  companies  who  are  expanding  and/or  decreasing  their  business  and 
locations  was  essential  for  the  project.  Understanding  how  companies  in  the 
aforementioned  industries  chose  sites  helped  frame  the  issues  associated  with  site 
selection.  Companies  in  the  restaurant,  retail,  and  grocery  industries  were  interviewed. 
Although many companies did not divulge corporate information for this project, the few
that did participate provided a basis for understanding current practices.

Almost all companies maintained a real estate division, department, or team. The
ultimate decision authority usually rested with a manager exercising subjective judgment. All but one of the
companies utilized a form of modeling to assist in their site selection process. However,
only two of the companies utilized optimization techniques. Artificial intelligence
algorithms were only utilized in one of these instances.


C.  OPTIMIZATION  USING  ARTIFICIAL  INTELLIGENCE 

Looking  at  a  potential  site  in  isolation  does  not  provide  real  estate  analysts 
information  about  the  market  as  a  whole.  Interactions  between  sites  need  to  be  examined 
when  looking  at  an  entire  market.  Also,  consumer  actions  need  to  be  accounted  for.  This 
is  why  optimization  is  an  important  factor  in  real  estate  site  selection. 

Optimization using genetic algorithms for real estate site selection is a relatively
new concept. Genetic algorithms first became a viable option in search strategies due to
the work of Holland in 1975 at the University of Michigan. He was the first person to
apply the concepts of biological evolution to computational algorithms using the binary
coding digits of 0 and 1. Holland was able to imitate the evolutionary process of natural
selection within a search system by using multiple artificially generated encoding and
selection strategies. This was the first time that an artificially intelligent optimization
technique was used in theory (Kim, 2001). The procedures for encoding genetic
algorithms will be discussed in further detail later in this project.
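
To make the binary encoding concrete, the sketch below represents a market plan as a string of 0s and 1s, one bit per candidate site (1 = open, 0 = closed), and evolves a population of such strings. It is a minimal illustration under stated assumptions, not the procedure used by Holland or the one detailed later in this project: the fitness function (site profit minus a pairwise overlap penalty), the operators (tournament selection, single-point crossover, bit-flip mutation, elitism), and all parameter values are hypothetical choices.

```python
import random


def fitness(bits, profit, overlap):
    """Market profit: sum of open-site profits minus pairwise overlap penalties."""
    total = sum(p for b, p in zip(bits, profit) if b)
    n = len(bits)
    for i in range(n):
        for j in range(i + 1, n):
            if bits[i] and bits[j]:
                total -= overlap[i][j]
    return total


def genetic_search(profit, overlap, pop_size=20, generations=50,
                   mutation_rate=0.05, seed=0):
    """Return (best_bits, best_fitness) found for two or more candidate sites."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(profit)

    def score(chromosome):
        return fitness(chromosome, profit, overlap)

    # Initial population of random bit strings.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        next_pop = [pop[0][:]]  # elitism: carry the best chromosome forward
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of two random individuals.
            parent_a = max(rng.sample(pop, 2), key=score)
            parent_b = max(rng.sample(pop, 2), key=score)
            cut = rng.randint(1, n - 1)  # single-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=score)
    return best, score(best)


# Hypothetical market with four candidate sites.
profit = [5, 4, 3, 6]
overlap = [
    [0, 3, 0, 1],
    [3, 0, 2, 0],
    [0, 2, 0, 4],
    [1, 0, 4, 0],
]
best_bits, best_value = genetic_search(profit, overlap)
```

Because the fitness is evaluated on the whole bit string, the interaction between sites is scored directly, which is the property that single-site models lack.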

Genetic algorithms have been used in many applications since their development.
However, Goodchild’s application of genetic algorithms to
location-allocation problems in 1986 marked the first time this form of optimization was
used in a real-estate setting (Kim, 2001). Unfortunately, he was unable to show that
genetic algorithms were an improved form of optimization. He hypothesized that
supercomputer technologies would enable genetic algorithms to become the best
optimization technique in the future (Hosage and Goodchild, 1986).

As time progressed, genetic algorithms continued to be researched as a viable form of
optimization. Densham (1991) continued Goodchild’s research by trying to
implement  strategies  for  solving  large  location-based  problems  using  genetic  algorithms. 
He  was  ultimately  able  to  show  that  by  pre-processing  data,  genetic  algorithms  could  be 
used  as  a  good  optimization  technique.  However,  even  by  pre-processing  data,  the  time 
needed  to  analyze  data  was  unrealistic.  The  trade-off  for  improved  results  utilizing 
genetic  algorithms  was  not  necessarily  worth  the  added  time  to  get  the  results. 

In  the  late  1990’s  great  improvements  were  made  in  computing  speeds.  At  this 
time  the  cost  for  higher  computing  power  also  came  down.  It  was  now  more  affordable 
to  purchase  the  computing  power  necessary  to  analyze  large  location-based  problems 
using  genetic  algorithms.  Hurley,  Moutinho,  and  Stephens  (1995)  along  with  Houck, 
Joines,  and  Kay  (1996)  showed  improvements  while  using  genetic  algorithms.  The 
location-allocation  problem  was  now  being  optimized  using  these  techniques.  This 
problem  dealt  with  the  optimization  of  goods  at  specific  locations.  For  example,  if  a 
company  wanted  to  optimize  its  products  throughout  its  existing  stores,  this  would  be  a 
useful  tool.  This  marked  the  first  time  genetic  algorithms  were  effectively  being  utilized. 

This  project  deals  with  utilizing  genetic  algorithms  for  real  estate  site  selection. 
Optimizing  consumer  behavior  and  available  real  estate  locations  is  a  new  concept  for  the 
application  of  genetic  algorithms.  Felicity  George  (1994)  does  provide  an  example  of 
how  genetic  algorithms  helped  optimize  car  dealerships  in  England.  Her  results  showed 
the  advantage  of  genetic  algorithms  for  optimization  specifically  for  larger  inputs.  The 
idea  of  using  this  form  of  optimization  in  retail  settings  is  not  widely  documented.  The 
real  estate  site  selection  problem  is  also  more  complex  than  the  location-allocation 
problem. It could be the next step in the evolution of using genetic algorithms in industry for
optimization.


D.  CHAPTER SUMMARY


This  chapter  provided  background  information  to  lay  the  foundation  for  the  issues 
raised  throughout  the  remainder  of  this  project.  With  respect  to  the  broad  issue  of  real 
estate  site  selection,  it  was  shown  to  be  a  highly  complex  matter.  As  the  issue  was 
explored  it  was  seen  to  be  an  area  that  is  managed  differently  in  various  industries  and 
companies  within  the  private  sector.  A  multitude  of  factors  affect  the  idealized  outcomes 
that  a  company  may  be  seeking.  Such  a  mix  of  factors  requires  careful  examination  and 
optimization in order to achieve the desired effects. The use of artificial intelligence
algorithms to manage factors and optimize results in site selection is a prime solution. Due
to the decreasing cost of computing and increased knowledge of artificial intelligence in
GIS, these artificial intelligence algorithms can minimize poor site selection decisions for
companies.


III.  REAL ESTATE SITE SELECTION ANALYSIS


A.  INTRODUCTION 

Over  the  past  decade,  corporate  real  estate  management  has  undergone  a 
dramatic  change  as  it  has  matured  into  a  distinct  discipline.  As  a  part  of 
the  maturing  process,  the  discipline  has  moved  beyond  its  focus  on  basic 
real  estate  services.  (Rabianski,  DeLisle  &  Carn,  2001). 

Rabianski  et  al.  (2001)  discuss  how  choosing  an  optimal  site  for  a  store  or 
company  is  emerging  into  a  complex  field  of  study.  Companies  are  no  longer  utilizing 
sites  of  opportunity  and  instinct  to  choose  new  sites.  Instead,  companies  are  relying  on 
more  scientific  means  of  site  selection. 

In  the  retail  industry,  retailers  are  “paying  greater  attention  to  making  sure  that 
their  stores  are  in  the  right  places  at  the  right  times”  (Buss,  2002).  In  order  to  make  this 
happen,  the  real  estate  market  needs  to  be  analyzed.  However,  the  analysis  must  not  stop 
at  that  point.  A  more  in-depth  process  needs  to  be  utilized  to  ensure  optimized  results. 


B.  REAL  ESTATE  SITE  SELECTION  PROCESS 

Bergeron  (2005)  provided  a  site  selection  process  in  Figure  1  (as  seen  in  the 
previous  chapter)  detailing  steps  needed  for  optimal  site  selection.  This  project  will  focus 
on  the  “Plan  the  Work”  and  “Work  the  Plan”  sections  of  this  process.  If  a  company 
desires  to  expand  in  a  market  or  optimize  its  profit,  site  selection  may  need  to  be 
examined  to  open  or  close  stores. 

Many  companies  recognize  their  organization  needs  to  choose  potential  sites 
carefully  in  order  to  allow  for  maximum  profit.  This  falls  into  Bergeron’s  initial  step  of 
“Recognize  Organizational  Needs.”  If  a  company  wants  to  expand  its  business,  it  must 
recognize  what  is  needed  in  order  to  expand.  Locating  new  areas  for  expansion  is  an 
example  of  one  of  these  needs.  Many  times  this  includes  expanding  store  or  branch 
locations.

Her next step, “Create the Charter,” shows that the company will operate under a
centralized  vision  or  scope.  This  means  that  site  selection  for  potential  locations  should 
fall  in  line  with  the  company’s  goals  and  mission.  If  the  company’s  vision  were  to 
become  the  top  widget  seller  in  the  northeast,  it  would  not  make  sense  to  look  for  new 
locations  in  the  far  west. 

Bergeron’s  next  step  outlined  is  “Assemble  the  Team.”  Generally  a  company  will 
have  a  real  estate  department  or  team  that  will  analyze  potential  sites.  They  will  assist 
management  in  making  informed  decisions  prior  to  expanding  in  untapped  markets  or 
reassessing  underperforming  markets.  Depending  on  the  available  talent  and  goals  of  the 
company, different teams may be assembled. Larger companies may have the
resources to assemble a large department that is capable of conducting more advanced
methods  of  site  selection.  This  leads  into  her  next  step  of  “Plan  the  Work.” 

Planning  the  work  that  needs  to  be  done  for  real  estate  site  selection  can  depend 
on  the  scope  of  the  vision  and  the  process  to  be  utilized.  A  company  that  wants  to  grow 
nationally  will  most  likely  be  examining  trade  areas  across  the  country.  However, 
smaller  companies  may  be  concentrating  on  regionalized  growth.  Either  way,  the  process 
for  site  selection  may  be  the  same.  Company  real  estate  departments,  and  the  analysts 
that  work  in  them,  must  plan  ahead  for  the  process  to  be  utilized.  Four  processes  for  site 
selection  will  be  discussed  in  detail  within  this  chapter. 

Bergeron  (2005)  also  provides  a  site  selection  model  in  Figure  2  showing  the 
factors  that  need  to  be  evaluated  within  the  process.  Four  key  factors  are  shown  that  need 
to  be  managed  in  order  to  ultimately  provide  a  decision  that  either  “satisfies  business 
needs”  or  “meets  selection  criteria”:  Geography  &  Culture,  Environment,  Costs  &  ROI, 
and  Workforce. 

Source: Carol Bergeron, “Making site selection decisions in the worldwide economy,” in The Handbook of Business Strategy, 2005.


Figure  2.  Site  Selection  Model 


Geography  and  culture  deal  with  the  scope  of  the  company’s  planned  growth.  As 
stated  earlier,  a  company  may  only  wish  to  grow  locally.  It  would  not  want  to  be  looking 
into  national  markets  in  areas  that  do  not  coincide  with  the  company’s  goals  and  vision. 
Also,  cultural  factors  may  influence  the  company’s  decisions.  Many  companies  provide 
products or services that do not fit in all cultures. A bank that provides services primarily
to military members may find that it performs better when located near military
installations. Opening a branch in an area without a large military presence would not
necessarily be the best business decision.

Environmental factors come from the area into which the company wishes to expand. If
the local economy of a trade area is currently in a recession, site expansion in that area
may not be the best choice. Also, if the stores within a trade area are underperforming
due to environmental factors such as politics or legal issues, the company might want to
examine the potential implications of closing stores.

Costs and ROI (Return on Investment) can directly impact the decision to open or
close stores in a market. The environment of a specific trade area can also drive them.
Opening a store requires up-front investment. Construction, new product, and
workforce costs can dictate whether or not it makes sense to open in a new market. The
environment can impact these as well. Labor costs in New York City, for example, are higher than
those in Houston, TX. These costs may drive new store size and product placement as
well.

The  final  group  of  factors  that  Bergeron  discusses  is  workforce  related.  The 
workforce  available  in  a  specific  area  may  not  consist  of  the  desired  quality.  This 
workforce  also  may  demand  higher  salaries  as  compared  to  other  locations  depending  on 
the  environment.  Labor  can  cause  its  own  problems  when  looking  at  the  real  estate  site 
selection process. A site may be evaluated as having high sales potential from a consumer
standpoint; however, workforce considerations such as employment legalities and related
issues may dictate otherwise. An example of this could be a situation where a retail company would like to
open  stores  in  an  urban  area  such  as  San  Francisco  or  New  York  City.  These  two  cities 
traditionally  have  extremely  high  costs  of  living.  The  retail  company  may  not  be  willing 
to  pay  its  potential  employees  higher  wages  to  match  standard  rates  in  the  city.  This 
would  cause  potential  problems  for  the  company. 

Managing  these  four  factors  can  be  a  mentally  taxing  task.  Companies  must  look 
at  how  they  wish  to  manage  these  factors  prior  to  conducting  analysis.  Generally  data  is 
collected  into  databases  in  order  to  be  analyzed  by  corporate  real  estate  departments. 
Buckner  (1998)  states  “the  information  that  is  used  in  developing  a  database  (and  from 
what  source  it  is  derived)  is  a  function  of  how  quickly  a  decision  needs  to  be  made,  the 
level  of  accuracy  required,  as  well  as  developmental  cost  considerations.”  For  example,  a 
short-term  decision  could  be  desired  on  whether  or  not  to  open  a  retail  site  in  a  new  trade 
area.  Time  may  not  allow  for  the  acquisition  of  all  necessary  data  and  for  a 
comprehensive  analysis.  Instead,  the  real  estate  team  may  decide  to  run  a  less 
comprehensive  analysis.  Current  analysis  techniques  can  range  from  a  gut-instinct 
decision to highly complex algorithms. The company must establish its site selection
techniques early in the site selection process, prior to acquiring data. Four main site
selection techniques will be discussed in the following sections.




C.  REAL  ESTATE  SITE  SELECTION  TECHNIQUES 

As  previously  mentioned,  multiple  factors  are  managed  within  the  real  estate  site 
selection  process.  In  order  to  manage  these  factors,  modern  standard  site  selection 
techniques are utilized in theory and practice. O’Malley et al. (1995) provide a
preliminary discussion of site selection techniques. They outline the checklist method,
analog method, and regression modeling. These techniques are currently being utilized;
however, another selection technique has emerged from the work of Professor David Huff
in the 1960s: spatial interaction modeling, sometimes referred to as gravity modeling.
Figure  3  displays  the  various  techniques  utilized  for  real  estate  site  selection. 


Figure  3.  Site  Selection  Approaches 

Although  there  are  multiple  methods  for  site  location  analysis,  none  of  the 
methods  are  able  to  optimize  multiple  locations  within  a  market  to  provide  maximum 
profit.  In  order  to  provide  optimized  ROI,  further  optimization  techniques  need  to  be 
utilized.  These  will  be  discussed  in  Chapter  IV.  In  order  to  gain  a  full  understanding  of 
current  site  location  methods  utilized  in  practice,  all  methods  will  be  discussed  in  this 
chapter.  However,  the  analog  method  and  regression  modeling  will  be  studied  further  for 
optimization  purposes. 

1.  The  Checklist  Method 

The checklist method takes a basic approach to site selection.
Geodemographics, which combine geographic and demographic information on
populations, are treated as a checklist of factors (O’Malley et al., 1995). The analyst
simply checks off the information for a specific site. The more checks a specific site
gains, the more likely the site satisfies the initial search criteria. Because this method
solely lists parameters, quantitative results are not easily shown. Optimization of site
selection criteria is not available.

Lilien  and  Kotler  (1983)  provide  a  useful  example  of  this  method  by  utilizing 
eight  major  site  parameters.  Each  of  these  parameters  is  then  subdivided  into  several 
attributes.  Analysts  check  each  parameter  separately  for  the  specified  location  and 
develop  a  list  of  strengths  and  weaknesses  of  all  locations.  The  list  compares  all  the 
locations using empirical rules and weights (Sulek, Lind, & Maruchek, 1995). This is
considered a simple rule-based method that is inexpensive and quickly performed;
however, it is also highly subjective and possibly over-simplistic.

The  checklist  approach  can  be  useful  when  looking  at  factors  such  as  competition 
and  consumer  spending,  demographic  composition,  and  the  comparison  of  potential  real 
estate  sites.  An  analyst  will  judge  these  factors  next  to  each  other.  However,  there  is  a 
great  deal  of  judgment  placed  on  these  factors.  This  can  be  a  good  evaluation  so  long  as 
the  analyst  is  properly  trained  in  this  field. 

The  goal  of  this  project  is  to  ultimately  identify  a  means  of  optimization  for  real 
estate  site  selection  in  a  given  market.  The  checklist  method  relies  on  a  more  artistic 
means  of  site  selection  analysis.  A  sample  outcome  of  this  approach  is  seen  in  Table  1. 
This checklist allows an analyst to verify specific attributes about a location. If the
analyst were looking for a site that had a population over 25,000, a median household
income greater than $40,000, and over 15% college graduates, no site on this checklist
would have all of these attributes. However, site 4 has all but one of them. Site 4 does
not have the minimum population but is still listed as a potential site because it had
enough of the attributes in the checklist to be included. The
checklist method is a means for organizing this information in a database.

Sample Checklist

Site #   Driving    1997         Median Household   % College   Possible Site
         Distance   Population   Income             Graduate    Location?
         >2         >25,000      >$40,000           >15%
1        No         Yes          No                 No          No
2        Yes        Yes          No                 Yes         Yes
3        Yes        Yes          No                 No          No
4        Yes        No           Yes                Yes         Yes
5        Yes        Yes          Yes                No          Yes
6        Yes        Yes          No                 No          No
7        Yes        No           No                 No          No

Table 1.  Sample Checklist for a Retail Store


The  possible  site  locations  are  chosen  based  on  a  series  of  yes/no  questions.  This  method 
can  include  some  subjectivity  depending  on  the  questions  being  utilized.  Due  to  the 
possible  subjectivity  of  this  method,  it  is  not  suitable  for  further  analysis.  Optimization 
of  an  entire  market  based  upon  subjective  questions  will  not  be  possible  as  shown  in 
Chapter  IV.  Artificial  intelligent  algorithms  will  need  an  actual  model  output.  The 
following  three  methods  (Analog,  Regression,  and  Spatial  Interaction)  will  be  able  to 
provide  sufficient  data  to  be  used  for  optimization. 
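
As an illustration, the yes/no screening in Table 1 can be sketched in a few lines of Python. The cutoff of three "Yes" answers out of four is an assumption inferred from the table (every site marked as a possible location has exactly three), not a rule stated by the source; the dictionary layout and the function name are likewise hypothetical.

```python
# Checklist screening for the seven sites in Table 1.
# Each tuple holds the answers for: driving distance > 2, population > 25,000,
# median household income > $40,000, and college graduates > 15%.
CHECKLIST = {
    1: (False, True, False, False),
    2: (True, True, False, True),
    3: (True, True, False, False),
    4: (True, False, True, True),
    5: (True, True, True, False),
    6: (True, True, False, False),
    7: (True, False, False, False),
}

# Assumed cutoff: a site qualifies with at least three "Yes" answers.
MIN_CHECKS = 3


def possible_sites(checklist, min_checks=MIN_CHECKS):
    """Return the site numbers whose count of "Yes" answers meets the cutoff."""
    return [site for site, answers in checklist.items()
            if sum(answers) >= min_checks]


if __name__ == "__main__":
    print(possible_sites(CHECKLIST))
```

The output is only a pass/fail list. It carries no sales estimate, which is why the checklist method cannot feed an optimization routine.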

2.  The  Analog  Method 

The  Analog  Method  assesses  new  sites  by  “identifying  existing  sites  whose  trade 
area  and  general  attributes  (and  hence,  revenues)  resemble  that  expected  for  a  new 
location” (Daniel, 1994). The comparison can be based on the company’s own stores or on its
competitors’ stores. Analog modeling is the easiest modeling to understand because
analysts look at existing data in a specific trade area. A company can see what is already
happening rather than estimating a projection.

Mendes  and  Themido  (2004)  explain  that  the  analog  method  is  a  “natural 
outcome”  from  the  checklist  method.  This  method  attempts  to  gain  objectivity. 
Analogous  locations  are  grouped  together  by  similar  attributes  as  “empirical 
benchmarks.”  These  attributes  could  be  driving  distance,  population,  or  percentage  of 
college  graduates  in  the  location  area.  Each  location  is  then  evaluated  within  the  group. 
The  new  site’s  location-related  parameters  are  evaluated  on  a  pre-assigned  scale.  The 
analyst  for  each  situation  would  determine  the  scale.  Store  sorting  lists  are  then  compiled 
to  rate  the  existing  stores  against  the  potentially  new  store. 

This method of real estate site selection is predominantly utilized in the retail
industry; however, the analog method tends to work across the entire spectrum of
industries looking to expand their stores or branches. Also, the analog method of
modeling site selection has the longest proven track record (Buckner, 1998). Because
analogs require very little mathematical computation compared to regression
and spatial interaction modeling, the analog method may be applied more
readily when fewer resources are available.


In order to conduct a site selection analysis utilizing the analog method, specific
information needs to be available to the analyst. Generally this information comes from
the company’s own store point-of-sale (POS) data, competitors’ data, census data, and
other publicly available data. The analyst will generally dissect the trade area into
sub-areas that will be analyzed as potential analogous areas for a new site. Zip codes are a
common form of this sub-area grouping. Sales are then estimated for each sub-area on an
annual or weekly basis as determined by the company. The capture rate for each sub-area
is then calculated by dividing sales for that area by total sales. Table 2 provides an
example of an analog table for a retail store in a sample study area. Note that the
geographic areas are divided into zip codes. Total sales for the analog are $3,000,000.
The capture rates for the zip codes amount to only 74.9%. This means that 25.1% of the
sales come from outside of this trade area. The information provided by this analog table
can then be used as input for financial models. The output from the analog is designed
as a decision aid that needs to be analyzed in comparison to similar locations.


Trade Area Analog Table

Zip Code  Zip Name          Driving   1997        Median     % College  Capture  Sales        Per Capita
                            Distance  Population  Household  Graduate   Rate                  Sales
                                                  Income
48141     Inkster           1.2       28,839      $27,905    8.9%       14.7%    $441,000     $15.29
48124     Dearborn          2.2       27,224      $39,042    19.6%      20.5%    $615,000     $22.59
48125     Dearborn Heights  2.6       20,862      $35,024    6.5%       5.5%     $165,000     $7.91
48128     Dearborn          2.8       18,728      $44,350    28.1%      22.5%    $675,000     $36.04
48127     Dearborn Heights  4.3       30,274      $41,965    14.7%      5.7%     $171,000     $5.65
48135     Garden City       4.5       27,239      $37,241    9.2%       3.5%     $105,000     $3.85
48184     Wayne             5.0       19,303      $33,002    7.9%       2.5%     $75,000      $3.89
Trade Area Totals:                    172,469     $36,933    13.6%      74.9%    $2,247,000   $13.60
Sales from Beyond Trade Area:                                           25.1%    $753,000
Grand Total:                                                            100.0%   $3,000,000

Table 2.  Sample Analog Table for a Retail Store (Buckner, 1998)
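
The capture-rate arithmetic behind Table 2 can be reproduced with a short Python sketch. The sales figures are taken from the table; the variable and function names are hypothetical and chosen only for illustration.

```python
# Sales by zip code from Table 2 (Buckner, 1998).
ZIP_SALES = {
    "48141": 441_000,
    "48124": 615_000,
    "48125": 165_000,
    "48128": 675_000,
    "48127": 171_000,
    "48135": 105_000,
    "48184": 75_000,
}
# Grand total for the analog store, including sales from beyond the trade area.
TOTAL_STORE_SALES = 3_000_000


def capture_rates(zip_sales, total_sales):
    """Capture rate of each sub-area = sub-area sales / total store sales."""
    return {zip_code: sales / total_sales for zip_code, sales in zip_sales.items()}


if __name__ == "__main__":
    rates = capture_rates(ZIP_SALES, TOTAL_STORE_SALES)
    trade_area_share = sum(rates.values())
    for zip_code, rate in rates.items():
        print(f"{zip_code}: {rate:.1%}")
    print(f"Trade area: {trade_area_share:.1%}, beyond: {1 - trade_area_share:.1%}")
```

Dividing by the grand total rather than by the trade area subtotal is what makes the seven capture rates sum to 74.9% instead of 100%.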

Disadvantages of using analogs can be seen when inexperienced analysts
manipulate data and when insufficient data is utilized. The analog method can also be
subjective. If an inexperienced analyst fails to ask specific questions while validating the
results, the results could be invalid. The analyst needs to verify that the average trade
area penetration levels projected for a specific market coincide with the actual average trade
area penetration levels seen in the parent databases. The analyst might also want to use
reality-check questions such as “Does this make sense based on the database?”

The  second  potential  disadvantage  of  the  analog  method  comes  from  the  database 
itself.  If  the  company  does  not  have  sufficient  data  available,  the  analogs  may  not 
provide useful results. The results in the table above could easily mislead an analyst if
insufficient data were collected. Total sales for the analog are stated at $3,000,000. If sales
from beyond the trade area had not been accounted for, total sales would only have been
$2,247,000. This could lead to incorrect per capita sales. Decisions based on
these figures would then be incorrect. Collecting the right amount of data for the method
of analysis can determine whether the analysis will lead to realistic results.

Overall  the  analog  method  provides  the  analyst  a  means  to  obtain  proposed  site 
sales  forecasts  from  a  historical  reference  of  similar  existing  stores.  Additional  site 
selection  techniques,  such  as  regression  and  spatial  interaction  modeling,  will  go  into 
further depth mathematically and can potentially provide more in-depth results.

3.  Regression  Modeling 

Regression  Modeling  employs  “statistical  analysis  to  derive  linear  or  non-linear 
relationships  between  site  attributes  and  site  performance”  (Daniel,  1994).  Parameters 
such  as  distance,  competition,  and  population  are  measured  from  existing  store  locations. 
These  existing  locations  are  analogous  to  potential  new  locations.  This  data  is  then  used 
to calibrate a linear statistical equation (Mendes et al., 2004).

This form of modeling is widely used; however, it is often misused, especially
across the retail industry (Kotler, 1984). The quality of the regression model’s output
depends directly on the large amount of data needed to feed the equations. In order
to ensure a quality output, the analyst needs to obtain many observations. Usually
companies with a large number of point-of-sale (POS) observations can utilize these models more
effectively. This makes it difficult for smaller and mid-sized companies to use
regression modeling effectively in their real estate site selection methods. Many smaller
and mid-sized companies may not have sufficient POS data in locations where they wish to
expand.

Multiple  regression  modeling  provides  a  “means  to  model  how  a  variable  to  be 
predicted  varies  with  a  set  of  independent  predictors.  Thus,  the  sales  of  a  retail  store 
from  a  specific  ZIP  Code  may  be  related  to  distance  from  the  store  and  the  population  of 
that  ZIP  Code”  (Buckner,  1998).  To  further  understand  the  mechanics  of  regression 
modeling,  the  actual  statistical  method  should  be  analyzed.  The  following  is  an  example 
of  a  multiple  regression  equation  that  can  help  forecast  sales  at  any  location: 

$$\text{Estimated Sales} = a + b_1 X_1 + b_2 X_2 + b_3 X_3$$

where

$a$ represents the intercept value or constant,

$X_1$ represents distance as it relates to sales,

$X_2$ represents population as it relates to sales,

$X_3$ represents competition as it relates to sales, and

$b_1$, $b_2$, and $b_3$ represent the multiplicative weighting assigned by regression to
each variable (Buckner, 1998).

For this equation, the analyst would be evaluating estimated sales (the dependent
variable) for a potential location. This is the value being sought. The estimated sales
depend on the variables on the right side of the equation, which are considered
independent variables ($X_1$, $X_2$, $X_3$). The independent
variables are used to predict the value of estimated sales.
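
A minimal sketch of how such an equation could be calibrated is given below, assuming ordinary least squares via NumPy. The six store observations are invented purely for illustration and do not come from the source; the function names are hypothetical.

```python
import numpy as np


def fit_sales_model(distance, population, competition, sales):
    """Fit sales = a + b1*distance + b2*population + b3*competition
    by ordinary least squares and return the array (a, b1, b2, b3)."""
    X = np.column_stack([
        np.ones(len(sales)),  # column of ones for the intercept a
        distance,
        population,
        competition,
    ])
    coefficients, *_ = np.linalg.lstsq(X, np.asarray(sales, dtype=float), rcond=None)
    return coefficients


def estimate_sales(coefficients, distance, population, competition):
    """Apply the fitted equation to a candidate site."""
    a, b1, b2, b3 = coefficients
    return a + b1 * distance + b2 * population + b3 * competition


if __name__ == "__main__":
    # Hypothetical observations for six existing stores (not from the source).
    distance = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])          # miles
    population = np.array([30.0, 25.0, 28.0, 18.0, 22.0, 15.0])  # thousands
    competition = np.array([2.0, 1.0, 3.0, 2.0, 4.0, 1.0])       # competing stores
    sales = 100.0 - 5.0 * distance + 10.0 * population - 8.0 * competition
    coeffs = fit_sales_model(distance, population, competition, sales)
    print(np.round(coeffs, 3))
    print(round(float(estimate_sales(coeffs, 2.5, 20.0, 2.0)), 2))
```

Because the illustrative sales values are generated without noise, the fit recovers the generating weights exactly. Real POS data would leave residual error, which is why many observations are needed.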

Although this method of site selection seems quite straightforward, there are
many traps analysts should avoid. Regression modeling requires many assumptions that
do not necessarily hold in the real world. Linear regression takes data and applies a
straight line to approximate the data points as described by the regression equation. This
also assumes that the straight line is the best representation of the data. A linear
relationship may not represent the data appropriately. A curvilinear projection may work
better. In this case, the independent variables would need to be transformed in order to
work in a linear regression model. Again, this is a potential pitfall analysts need to be
aware of while modeling.
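
The idea of transforming an independent variable can be sketched as follows. The logarithm of distance is used here only as one possible transformation; the source does not name a specific one, and the data and function name are hypothetical.

```python
import numpy as np


def fit_log_distance(distance, sales):
    """Fit sales = a + b * ln(distance) with ordinary least squares.

    The curvilinear relationship in distance becomes linear in ln(distance),
    so a straight-line fit can still be used on the transformed variable.
    """
    log_distance = np.log(distance)
    b, a = np.polyfit(log_distance, sales, deg=1)  # returns slope, then intercept
    return a, b


if __name__ == "__main__":
    # Hypothetical data: sales fall off quickly near the store, then level out.
    distance = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
    sales = 500.0 - 120.0 * np.log(distance)
    a, b = fit_log_distance(distance, sales)
    print(round(float(a), 2), round(float(b), 2))
```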

Another potential problem is the independence of the independent variables. The
independent variables should not be strongly related to one another.
High income and high education can both be considered factors in bookstore sales;
however, they cannot be thought of as independent in their contribution to sales. These
interrelations in the data should be avoided in order to gain better results.
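
One simple way to screen for such interrelations is to inspect pairwise correlations between candidate predictors before fitting. The sketch below assumes Pearson correlation and an arbitrary cutoff of 0.8; both are illustrative choices, and the sample figures are hypothetical.

```python
import numpy as np


def correlated_pairs(predictors, threshold=0.8):
    """Return (name_i, name_j, r) for predictor pairs with |r| >= threshold.

    `predictors` maps a variable name to a sequence of observations.
    The 0.8 cutoff is an arbitrary illustrative choice.
    """
    names = list(predictors)
    matrix = np.corrcoef([predictors[name] for name in names])
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = float(matrix[i, j])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], r))
    return flagged


if __name__ == "__main__":
    # Hypothetical trade-area data: income and education move together.
    data = {
        "income": [40, 55, 62, 48, 75, 90],        # $ thousands
        "education": [12, 20, 24, 15, 30, 38],     # % college graduates
        "distance": [3.0, 1.0, 4.0, 2.0, 5.0, 1.5],  # miles
    }
    for name_i, name_j, r in correlated_pairs(data):
        print(f"{name_i} vs {name_j}: r = {r:.2f}")
```

A flagged pair suggests dropping one of the two variables or combining them into a single index before calibrating the regression.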

One final caution about regression models comes from graphing results. The
results of a multiple regression model, as discussed above, do not truly fit a line. A
line is only accurate for a regression using one independent variable. A model with two
independent variables describes a plane. With more independent variables,
the fitted surface becomes a higher-dimensional geometry that is increasingly difficult to
visualize. Because of this, a claim that a single regression line represents performance
should be treated with caution.

Advantages  of  regression  modeling  include  its  ability  to  make  sense  of 
complicated  situations,  specifically  at  the  disaggregate  level  (trade  area  level).  This 
means  that  a  multiple  regression  model  would  be  able  to  sort  out  classes  of  competition 
such  as  direct  versus  indirect.  Regression  models  do  a  better  job  at  predicting  stores  that 
are  similar  to  those  supported  with  data  as  opposed  to  new  concept  stores. 

4.  Spatial  Interaction  Modeling 

Professor David Huff is considered the father of spatial interaction modeling.
His research in the 1960s pioneered this concept in site selection. Spatial interaction can
be a powerful tool but is underutilized in practice due to its complexities, the experience required,
and the information needed. Companies are able to assess sales by evaluating the situation
from the consumer’s perspective (Daniel, 1994). Furthermore, utilizing spatial statistics
has been shown to provide “more realistic inference, better prediction, and more
efficient parameter selection” (Pace, Barry, & Sirmans, 1998).

Newton’s Law of Gravity states “two bodies are attracted to each other in
proportion to their mass, and in inverse proportion to the square of the distance between
them” (Buckner, 1998). The same principle underlies spatial interaction models.
William J. Reilly studied retail concentrations in cities throughout the United States in the
early part of the 20th century. He realized that there was a gravitational pull among
customers and was able to model it as “Reilly’s Law of Retail Gravitation.” He stated
that retail attraction is directly proportional to the size of two trading areas and inversely
proportional to the square of the distance between the two retail trading centers (Nelson,
1958). This can be seen as follows:


$$D_{A-B} = \frac{d}{1 + \sqrt{P_B / P_A}}$$

in which:

$d$ = the distance, in miles, on major roads, between two adjacent towns,
A and B.

$P_A$ = the population of Town A.

$P_B$ = the population of Town B.

$D_{A-B}$ = the edge, or boundary of, Town A’s trading area, expressed in miles,
toward Town B from the center of Town A.

Reilly’s  model  takes  into  account  both  distance  and  the  attractiveness  of  other  shopping 
opportunities.  It  is  based  on  the  idea  that  agglomeration  increases  the  attractiveness  of 
stores,  and  that  high-density  areas  represent  agglomeration.  In  other  words,  shopping  in 
higher  density  areas  is  considered  more  attractive.  Reilly’s  law  of  retail  gravitation  was 
the  first  to  quantify  this  idea  for  consumers  in  a  retail  setting  based  on  Newton’s  law  of 
planetary  attraction.  The  decision  between  the  cost  of  travel  and  the  attractiveness  of 
alternate shopping opportunities is the heart of this model. Today’s spatial interaction
models are based on the concepts introduced by Reilly’s gravitation model (Brubaker,
2004).
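
As a worked illustration, the trading-area boundary can be computed directly. The sketch assumes the commonly cited breakpoint form of Reilly’s law, D = d / (1 + sqrt(P_B / P_A)), with the symbols defined above; the two towns in the example are hypothetical.

```python
import math


def reilly_breakpoint(d, pop_a, pop_b):
    """Distance from the center of Town A to the boundary of its trading area,
    measured toward Town B: D = d / (1 + sqrt(P_B / P_A))."""
    if d < 0 or pop_a <= 0 or pop_b <= 0:
        raise ValueError("distance must be non-negative and populations positive")
    return d / (1.0 + math.sqrt(pop_b / pop_a))


if __name__ == "__main__":
    # Hypothetical towns 30 miles apart: A has 40,000 residents, B has 10,000.
    boundary = reilly_breakpoint(30.0, 40_000, 10_000)
    print(f"Town A's trading area extends {boundary:.1f} miles toward Town B")
```

With these figures the boundary falls 20 miles from Town A and 10 miles from Town B, so the larger town draws customers from twice the distance.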

a.  Variables

Five  main  variables  need  to  be  adjusted  by  analysts  when  dealing  with 
spatial  interaction  models.  The  analyst  will  alter  one  or  more  of  these  variables  in  order 
to  simulate  the  trade  area  or  market  being  studied  more  accurately  (Buckner,  1998). 
Eventually  the  model  is  brought  into  a  balanced  state.  There  are  many  other  variables 
that  can  be  accounted  for  in  the  spatial  interaction  models.  However  the  following  lists 
the  five  main  variables. 

1)  Draw accounts for the percentage of each competitor’s sales that comes from a
given trade area. A competitor may have a site that is situated on
the outside of the designated trade area. This would provide a small draw. However, if
the competitor’s store were located in the center of the trade area, it could have a higher
draw. The analyst would need to estimate the draw in the initial planning stages of model
development.

2)  Curve provides an indication of the manner in which the
store’s sales are distributed. A high curve denotes that more sales come from
customers who live close to the location. A low curve number shows the
opposite: a high proportion of sales comes from customers living outside the trade area.
Typically a smaller “mom and pop” store would have an extremely high curve as
opposed to a major chain store like Wal-Mart. Major chains tend to pull in customers
from greater distances than “mom and pop” stores do (Buckner, 1998).

3)  The density radius indicates the geographic extent of a
given store’s trade area, showing the number of people living within a given
distance of the site. A typical number for density radius is 2 miles. This would mean that
each store in the model would pull customers from a 2-mile radius. This may or may not
be appropriate for every model. The analyst needs to take caution when modeling this
variable.

4)  Leakage accounts for the potential sales dollars in a trade
area that are being modeled but not absorbed by one of the stores in the model. The most
common forms of leakage are the smaller, independent stores in a trade area. These
stores are not easily accounted for in spatial interaction models. Since these models are
generally expensive to generate and complex to maintain, companies are usually more
worried about competition from other regional or national chains.

5)  Image is the final major variable in spatial interaction
models. Image rates locations against each other by relative
strength. The images of all sites in a model should average to an image rating of
100. A site with an image of 110 is considered better than average. Conversely, a site
whose image is 70 is considered less desirable than average.
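
The averaging constraint on image can be illustrated with a small rescaling sketch. The source does not say how raw image scores are obtained or rescaled, so the proportional rescaling below, the raw scores, and the function name are all assumptions made for illustration.

```python
def normalize_images(raw_scores):
    """Rescale raw attractiveness scores so that their average is 100."""
    if not raw_scores:
        raise ValueError("at least one site is required")
    mean_score = sum(raw_scores.values()) / len(raw_scores)
    if mean_score <= 0:
        raise ValueError("raw scores must have a positive average")
    return {site: 100.0 * score / mean_score for site, score in raw_scores.items()}


if __name__ == "__main__":
    # Hypothetical analyst ratings on an arbitrary scale.
    raw = {"Site A": 5.5, "Site B": 5.0, "Site C": 3.5, "Site D": 6.0}
    for site, image in normalize_images(raw).items():
        print(f"{site}: image {image:.0f}")
```

After rescaling, a site above 100 is stronger than the average site in the model and a site below 100 is weaker, matching the interpretation of 110 and 70 given above.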

b.  Strengths 

Spatial interaction models provide multiple strengths. Specifically, these
models do not require the development of a store database. The analyst will generally
work solely with population, demographics, and competitors’ information. This can be
especially good for companies that do not have a great deal of POS data or store
databases. For companies that are relatively new to real estate site selection, this may be
seen as a great advantage.

Analysts have the ability, through spatial interaction models, to run “what-if”
scenarios. Upon completion of a spatial interaction model for any given study area,
the analyst can see how sales at potential and current stores would be affected as
variables change. This can be done for very large markets and trade areas. Instead of
looking at a small interaction between two stores, a company would be able to assess
sales interactions across a major metropolitan area. Through the “what-if” scenarios, real
estate decisions could be evaluated based on a scientific method. However, the quality of
the decision would rest on the quality of the model. For this reason, the limitations
of spatial interaction models should be discussed.

c.  Weaknesses 

Spatial interaction models are essentially mathematical scenarios that try
to mimic actual customer actions for a given trade area. If the model is not developed
with sound reason and rigor, the model’s output will be invalid. The
analyst cannot think of these models as solely gravity based. In order to truly gain the
full power of spatial interaction, variables other than size and distance must be
introduced (Buckner, 1998). Experience is especially important while working on
these models. Experienced analysts typically can generate more realistic models based
upon their experience in a given trade area.


The marketplace for all goods is ever-changing. New products and stores
are introduced in an attempt to gain a portion of a market. For example, specialty stores
have opened competing with multiple types of competitors. Food/drug stores such as
Longs, natural stores such as Whole Foods, and hypermarkets such as Wal-Mart all have
overlap in some of their products. Some customers will drive 10 miles farther to shop at
a Wal-Mart when a Longs is one mile away from their house. Attempting to model this
consumer  desire  can  be  difficult.  Not  all  stores  cater  to  the  same  consumer,  yet  they  may 
have  similar  products.  Understanding  these  differences  is  essential  in  producing  a  quality 
spatial  interaction  model. 

Overall,  spatial  interaction  models  can  be  powerful  tools  for  predicting 
consumer  gravitation.  Real  estate  site  selection  analysts  need  to  be  aware  of  all  the 
strengths  and  weaknesses  when  using  these  models.  If  an  analyst  were  to  make  an 
incorrect assumption, the results could be incorrect. Proper training and experience are
necessary prior to working on these models. This is a key factor as to why many
companies  are  not  currently  using  these  models  within  their  real  estate  departments. 


D.  REAL  ESTATE  SITE  SELECTION  EXAMPLES 

To give a better understanding of how real estate site selection is conducted in
practice, a real-world example is provided. As noted in the previous section, analog and
regression models are the most widely utilized in practice and, therefore, will be shown
in this example. The checklist method relies on too much subjectivity and cannot be
optimized for an entire market. Spatial interaction models are more accurate; however,
they have more stringent data requirements, and the data may not be readily available at
most companies. The company shown in this example did not utilize spatial interaction
modeling.

Analyzing the variables utilized in analog and regression models, along with their
strengths and weaknesses, will allow a better differentiation between the methods. This
better understanding will also show when one method would be better to use than the
other. Ultimately, this will show how each method of site selection analyzes only
specific locations within a market. It is always important to look at a single location
within a market; however, analyzing the entire market will provide insight into ways to
optimize profit for the entire market rather than for a single location. This can only
be done through advanced optimization techniques such as artificial intelligence
algorithms (specifically genetic algorithms). The following example provides further
insight into the predominantly utilized methods of real estate site selection.

The  information  for  this  example  comes  from  a  Fortune  500  retail 
company.  The  company  and  analyst  shall  remain  anonymous  throughout 
this  project. 

1.  The  Scenario 

A major retailer, which currently has stores of various sizes worldwide, was
interested in maximizing profit in the Denver, Colorado area based on current and future
site locations within the study area. To gain a better understanding of what mix of
stores would be optimal for Denver, the company used an outside source to assist in its
market planning.

Figure 4 shows the company’s existing trade area in Denver. The classifications
of sites are noted in the legend. “Approved stores” are stores that are currently open
and have been approved to stay open. “Hit list void” locations are sites where previous
analysis of the trade area found the current market to be deficient. These sites have
specific properties associated with them, such as malls, lifestyle centers, or outlets.
“In/Out” sites are the locations to be analyzed within the site selection analysis. They
are sites that can be either included in future planning by remaining open or excluded
by closing the store. “Opt Void” sites are areas with voids in the market noted in
previous analysis. No specific site locations are associated with these voids; they are
only utilized to show areas of future growth potential. Finally, the figure shows
“Sacred Cows”. These are stores that are performing well and are not to be considered
for any changes in the market planning process.

Figure  4.  Denver  trade  area  with  current  and  potential  store  mix


The company wanted to look at its current mix of stores. It was interested in whether
it should open new stores, close existing stores, or keep the current mix as is. Keeping
a store open was called “in.” Closing a store was called “out.” Before beginning the
market planning process, the company set guidelines for the analysis. These guidelines
included the following list of parameters:

•  “Sacred  Cows”  should  not  be  considered  in  the  in/out  process. 

•  Focus  on  “Hit  list  voids”  when  conducting  market  analysis. 

•  Internal  Rate  of  Return  (IRR)  was  used  as  the  hurdle  rate,  with  a
minimum  of  25%.

•  Net  Present  Value  (NPV)  of  existing  and  potential  stores  was 
calculated  using  a  lifespan  of  15  years. 

This  list  of  guidelines  helped  the  third-party  analysts  conduct  market  analysis  for  the 
Denver  trade  area. 
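
The two financial guidelines can be illustrated with a minimal Python sketch. The cash flows below are hypothetical, and discounting NPV at the 25% hurdle rate is an assumption made only for illustration; the source does not state which discount rate the company used.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs today (Year 0)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-7):
    """Internal rate of return, found by bisection on NPV(rate) = 0.

    Assumes a single sign change in the cash flows (an outlay followed
    by inflows), so NPV falls steadily as the rate rises.
    """
    if npv(low, cash_flows) < 0 or npv(high, cash_flows) > 0:
        raise ValueError("IRR is not bracketed by [low, high]")
    while high - low > tol:
        mid = (low + high) / 2.0
        if npv(mid, cash_flows) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


HURDLE_RATE = 0.25    # minimum IRR from the guidelines
LIFESPAN_YEARS = 15   # store lifespan from the guidelines

# Hypothetical store: $1,000,000 outlay, then $300,000 per year.
flows = [-1_000_000.0] + [300_000.0] * LIFESPAN_YEARS
store_irr = irr(flows)
store_npv = npv(HURDLE_RATE, flows)  # assumed discount rate
passes_hurdle = store_irr >= HURDLE_RATE
print(f"IRR = {store_irr:.1%}, NPV = ${store_npv:,.0f}, passes = {passes_hurdle}")
```

A site whose IRR falls below the hurdle rate would be rejected under these guidelines regardless of its NPV.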
2.  The  Process


The analysts developed a method to explore possible options in the Denver area.
They were given the guidelines above and had access to the company’s current store
database and POS information. This, combined with population data, provided sufficient
information to develop a plan.

The analysts utilized the company’s store databases, POS data, and population
data to run 34 scenarios for the Denver market. A set of analogous stores was first
pulled from the company’s databases. These stores matched specified search criteria that
mimicked a potential site’s characteristics. One of the 34 scenarios is presented in this
project. This scenario shows the process of opening a core format store in Northlands.
The Pearl Street store in Boulder, Colorado is also closed in this scenario. Pearl Street
is a large women’s store in a street shopping location.

a.  Use  of  Analogs 

The main concept behind analog modeling is that a proposed store’s sales will be
similar to those of comparable stores that already exist within the company. Analysts
must choose these similar stores based on a set list of criteria. Ideally, an analyst
would want only analogs that strongly match the proposed store’s variables. This may not
always be feasible. The following steps outline the general approach used by this
company to choose analog sites. The analyst would start at step 1 and move towards
step 4.

1.  Restrict  the  search  to  match  the  location  type,  format  of  store,  market,  and
range  of  density  class. 

2.  If  more  matches  are  needed,  relax  the  market  constraint  allowing  market 
class  or  market  class  range  within  the  same  region. 

3.  Refine  the  search  by  range  of  trade  area  population. 

4.  Refine  the  search  by  range  of  effective  population  score  and/or 
competition. 

This list shows that the most important factors are location, format, market, and range
of density class. However, these factors must not be the only inputs to the model. The
location of the analog should be similar. This means that a rural analog should be used
for a proposed rural store. The format must be similar also. A large style format should
not be used as an analog for a proposed smaller store. The same thought process should be
used for the type of market (population profile, etc.). Upon finalizing the analog set,
the analyst should begin to compare essential information for the proposed site with
that of the analog stores.
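
The stepwise search can be sketched in Python as follows. This is only an illustration of the first three steps: the `Store` fields, the tolerance values, and the minimum number of matches are all hypothetical, and step 4 (effective population score and competition) is omitted.

```python
from dataclasses import dataclass


@dataclass
class Store:
    name: str
    location_type: str   # e.g. "mall", "lifestyle", "street"
    store_format: str    # e.g. "core", "full"
    market: str          # e.g. "Denver"
    market_class: str    # e.g. "large"
    region: str
    density_class: int
    trade_area_pop: int


def find_analogs(site, stores, min_matches=3, density_tol=1, pop_tol=0.25):
    """Return existing stores comparable to a proposed site."""
    def base(s):
        return (s.location_type == site.location_type
                and s.store_format == site.store_format
                and abs(s.density_class - site.density_class) <= density_tol)

    # Step 1: same location type, format, market, and density class range.
    matches = [s for s in stores if base(s) and s.market == site.market]

    # Step 2: too few matches, so accept the same market class anywhere
    # in the same region.
    if len(matches) < min_matches:
        matches = [s for s in stores if base(s)
                   and s.market_class == site.market_class
                   and s.region == site.region]

    # Step 3: refine by trade area population if enough analogs remain.
    lo = site.trade_area_pop * (1 - pop_tol)
    hi = site.trade_area_pop * (1 + pop_tol)
    refined = [s for s in matches if lo <= s.trade_area_pop <= hi]
    return refined if len(refined) >= min_matches else matches


# Hypothetical proposed site and store database.
site = Store("Proposed", "lifestyle", "core", "Denver", "large", "West", 3, 400_000)
stores = [
    Store("A", "lifestyle", "core", "Denver", "large", "West", 3, 380_000),
    Store("B", "lifestyle", "core", "Phoenix", "large", "West", 4, 420_000),
    Store("C", "lifestyle", "core", "Seattle", "large", "West", 2, 450_000),
    Store("D", "mall", "core", "Denver", "large", "West", 3, 400_000),
    Store("E", "lifestyle", "core", "Portland", "large", "West", 3, 900_000),
    Store("F", "lifestyle", "core", "Boston", "large", "Northeast", 3, 400_000),
]
analog_names = [s.name for s in find_analogs(site, stores)]
print(analog_names)
```

In this made-up data set only store A matches in step 1, so the market constraint is relaxed in step 2 and the population filter in step 3 then drops store E.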

Figure 5 shows the analog set for the proposed Northlands store. The
analogs similar to the proposed store come from all over the United States. Stores
included come from Knoxville, Syracuse, Sarasota, Fort Lauderdale, and other cities
nowhere near Denver, CO. Within this analog sales forecast model, two specific factors
are evaluated closely: $/Eff Cap and Capture Rate. $/Eff Cap means the dollars spent
per effective capita. The total dollars spent in a given area are divided by the
effective trade area population. The effective trade area population is calculated based
upon a population index. The average index would be 1.0; however, if the trade area’s
households spent more than average, the index would be greater than 1.0. This analog
model has only one analog with a population index greater than 1.0: Hartford, CT.
Hartford’s population index is 1.29. For this analog model, the $/Eff Cap was $5.21.

The  Capture  Rate  shown  in  this  analog  model  was  69.8%.  In  other  words, 
69.8%  of  the  sales  for  this  model  came  from  within  the  trade  area.  This  would  mean  that 
30.2%  of  the  sales  came  from  outside  the  trade  area  for  this  analog  model. 
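
The arithmetic behind the two measures can be sketched as below. The sketch assumes that the effective population is the trade area population multiplied by the population index, and that $/Eff Cap uses sales made within the trade area; the source does not spell out either detail, and all numbers are hypothetical.

```python
def effective_population(population, population_index):
    """Trade area population scaled by its spending index (assumed form)."""
    return population * population_index


def dollars_per_effective_capita(trade_area_sales, population, population_index):
    """$/Eff Cap: sales in the trade area per effective resident."""
    return trade_area_sales / effective_population(population, population_index)


def capture_rate(trade_area_sales, total_sales):
    """Share of a store's sales that originates inside its trade area."""
    return trade_area_sales / total_sales


# Hypothetical analog store.
population = 400_000
population_index = 1.25
trade_area_sales = 2_500_000.0
total_sales = 3_125_000.0

eff_pop = effective_population(population, population_index)
eff_cap = dollars_per_effective_capita(trade_area_sales, population, population_index)
capture = capture_rate(trade_area_sales, total_sales)
print(f"effective population = {eff_pop:,.0f}")
print(f"$/Eff Cap = ${eff_cap:.2f}, capture rate = {capture:.1%}")
```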

Thirty-three other analog models were run for this trade area. The analog
model presented here shows only the analogs for the site that was selected to open:
Northlands. The analysts working on this scenario conducted the same procedures for the
other alternatives within the trade area. Ultimately, Northlands was chosen to open based
upon the company’s goals. The presented analog provided the optimal results out of the
34 analogs conducted. The capture rate and $/Eff Cap were higher for this analog model,
but the decision was not based upon the analog model alone. The analysts also looked at
regression models to help ascertain what the best scenarios would be. The following
section will discuss how regression models were used in this scenario.

Figure  5.  Analogs  for  Northlands  site

b.  Use  of  Regression


The real estate analysts utilized a regression model to help show sales
forecasts for the proposed Northlands store. These regression models are considered
simple because sales are a function of distance only. The company has many regression
curves relating average sales per effective population to straight-line distance from a
given store. These curves are grouped by store type. For example, there is one set of
regression curves for mall type stores and another for lifestyle center stores. These
curves are further subdivided by market type. A possible set of curves could be mall
locations in medium markets (based upon the company’s definition of a medium market).
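
A distance-only regression of this kind might look like the following sketch. The exponential decay form, the observations, and the function names are assumptions chosen for illustration; the source does not describe the shape of the company’s curves.

```python
import math


def fit_decay_curve(distances, sales_per_capita):
    """Least-squares fit of ln(y) = ln(a) - b * d; returns (a, b)."""
    n = len(distances)
    logs = [math.log(y) for y in sales_per_capita]
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return math.exp(intercept), -slope


def forecast_sales(zones, a, b):
    """Sum the curve over zones of (effective_population, distance_miles)."""
    return sum(pop * a * math.exp(-b * d) for pop, d in zones)


# Hypothetical observations for one store type and market type:
# straight-line distance (miles) vs. average sales per effective capita ($).
distances = [1.0, 2.0, 5.0, 10.0]
observed = [6.55, 5.36, 2.94, 1.08]
a, b = fit_decay_curve(distances, observed)

# Hypothetical trade area zones around a proposed store.
zones = [(20_000, 1.0), (50_000, 4.0), (80_000, 9.0)]
print(f"a = {a:.2f}, b = {b:.3f}, forecast = ${forecast_sales(zones, a, b):,.0f}")
```

Each store type and market type would have its own fitted pair `(a, b)`, which is one way to read the grouping of curves described above.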

Figure 6 shows four regression outputs for the proposed Northlands store
below the analog model output. The four regressions are system isolation, analyst
isolation, system sequential, and analyst sequential. The system outputs depict exactly
what the regression curves dictate based upon the company’s databases. The company
utilizes store data, POS data, and census data to populate its databases. The analyst
outputs take user knowledge into account: the analysts may have other knowledge of the
area or of the regressions and account for it in their adjusted regression output. The
isolation and sequential labels describe how the model was run. Regression models run in
isolation assume the store is opening without any other stores in the surrounding areas.
The sequential models assume that other stores are currently open or will be opening and
will cannibalize some of this store’s sales. In this example there was no difference
between the isolation and sequential models.

Figure  6.  Regression  results  for  Northlands  site

The regression model outputs show a higher $/Eff Cap for both the system
and analyst models. However, the system model shows a 73% capture rate and the analyst
model a 65% capture rate, compared with 69.8% for the analog model. Part of the reason
for the analysts’ difference in $/Eff Cap and Capture Rate comes from their adjustment
of drive time (distance) and trade area. Figure 7 shows the system’s automated trade
area. The trade area is outlined in burgundy.


Figure  7.  System  Trade  Area 

Figure 8 shows the analysts’ adjusted trade area. The analysts felt that the proposed
Northlands store would draw more customers from the northern areas than the system’s
projection indicated. The analysts adjusted the trade area to represent their
assumptions. The adjusted trade area is outlined in burgundy.

Figure  8.  Analyst  Adjusted  Trade  Area

After  running  analog  and  regression  models  for  the  proposed  Northlands 
store,  the  analyst  has  a  relatively  good  idea  of  how  the  proposed  store  will  perform.  This 
is  all  based  upon  the  company’s  previously  recorded  historical  data.  There  are  other 
factors  to  assess  prior  to  making  an  ultimate  decision.  Also,  keep  in  mind  that  this  same 
process  was  done  for  33  other  proposed  locations. 

c.  Cannibalization  / Recapture 

Opening a new store at the Northlands location would potentially affect
other stores in the trade area. The new store would cannibalize the sales of current
stores, decreasing their sales while adding to the sales of the new store. This effect
is referred to as cannibalization. The opposite process is recapture. If the company
were to close a store, the surrounding stores in the trade area may see an increase in
sales due to recaptured customers. Customers would no longer have their usual store to
shop at in the trade area and might go to another of the company’s stores there. That
store would recapture those customers’ sales for the company.
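
The net effect of these transfers can be sketched as below. The store names, sales figures, and transfer and recapture rates are hypothetical; in practice they would come from the company’s models.

```python
def net_market_change(new_store_sales, existing_sales, transfer_rates):
    """Net change in company sales after opening a store.

    existing_sales: store name -> current annual sales.
    transfer_rates: store name -> fraction of that store's sales expected
    to shift to the new store (treated as 0.0 if not listed).
    """
    lost = {name: sales * transfer_rates.get(name, 0.0)
            for name, sales in existing_sales.items()}
    return new_store_sales - sum(lost.values()), lost


def recaptured_sales(closed_store_sales, recapture_rate):
    """Sales of a closed store that nearby company stores pick up."""
    return closed_store_sales * recapture_rate


# Hypothetical market: opening a store takes 6.5% of Store A's sales.
existing = {"Store A": 4_000_000.0, "Store B": 2_000_000.0}
net, lost = net_market_change(1_500_000.0, existing, {"Store A": 0.065})
print(f"cannibalized from Store A: ${lost['Store A']:,.0f}")
print(f"net gain for the market:   ${net:,.0f}")
print(f"recaptured after closing:  ${recaptured_sales(1_000_000.0, 0.5):,.0f}")
```

The new store’s forecast sales overstate its value to the company unless the cannibalized amount is subtracted, which is why the net figure is the one that matters for market planning.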




Figure 9, below, shows the impact of opening the Northlands store on the
other stores in this market. For example, the Flatirons Crossings store, which is a full
format mall store, would lose 6.5% of total sales if Northlands opened. Referring to the
Denver trade area map in Figure 4, this makes sense: the Flatirons Crossings store is
closest to the proposed Northlands store. Generally, the stores closest to the
Northlands location should be cannibalized the most.

Figure  9.  Northlands’  cannibalization/recapture


3.  The  Results 

The final result of this scenario was to open the Northlands store and close the
Pearl Street store. Figure 10 shows the Denver selection results. These results
summarize the trade area based on the actions discussed here. Each store and void
(potential site) is listed with Year 0 sales, IRR (internal rate of return), and NPV (net
present value). The baseline figures show what the company currently has in existence
for the study area. The recommended scenario figures show what the recommended scenario
will provide. The final set of figures shows the variance between the two. The scenario
that produced the highest NPV is also shown for comparison.

The results from this example are intended for use in the decision process. These
are not financial models. As discussed earlier, these results may not be the optimal
results for the study area. The scenario chosen as the best scenario does provide better
results than the other proposed scenarios. It has a higher IRR at 31.9%. This is a
decrease from the baseline IRR of 33.0%; however, the recommended scenario has a higher
NPV, $1,511,052 over the baseline NPV. There is a tradeoff with this scenario. The
scenario with the highest NPV has a smaller IRR of 30.1%. Each scenario must be
looked at in comparison to the others. Even though the recommended scenario did not
have the highest NPV, it was considered better than the scenario with the highest NPV.


E.  CHAPTER  SUMMARY


This chapter provided a basis for understanding real estate site selection processes
utilized in industry. A model for this process was presented, showing the vast number of
factors that need to be managed during this process. Additionally, four methods for
factor management were presented. Of these four methods, only three are capable of
being optimized to provide maximum ROI in a given market (analog, regression, and
spatial interaction). Optimization will be integral in truly ascertaining the optimal
mix of locations in a given market. This chapter showed methods to analyze potential and
current sites; however, these methods only provide a means for looking at sites in
isolation or with minimal consideration of other sites. To gain a better understanding
of the study area and how all sites interact with each other, further optimization is
required. The next chapter will provide insight into artificial intelligence algorithms,
specifically genetic algorithms, as a form of optimization. These algorithms will serve
as a means to optimize multiple site locations within a market to produce maximum
profit.


IV.  APPLICATION  OF  ARTIFICIAL  INTELLIGENCE


A.  INTRODUCTION 

As more companies enter an industry, competition for sales increases. Having a
store in an optimal location can help with sales. To stay competitive, companies will
also have to become more efficient. Many companies already optimize store allocation and
workforce. One area that many companies do not yet optimize is site selection.

This chapter will focus on methods to optimize site selections. Locating ideal
sites through computer modeling may not be good enough in today’s market. Finding
ways to optimize all facets of an organization will be an important venture.
Optimization utilizing artificial intelligence will serve three main objectives:
(1) determine potential in the market, (2) highlight voids in the market where possible
expansion could occur, and (3) maximize profit in the market based upon consumer
actions. Using these three objectives as guides will allow a better understanding of the
potential for artificial intelligence algorithms in optimization.

First, the aspect of optimization will be discussed. The example of real estate site
selection in Denver, CO in the previous chapter will be used as a basis. Pitfalls in the
previous example and ways to improve the process will be shown. The second portion of
this chapter discusses artificial intelligence for optimization. Artificial intelligence
algorithms can optimize site location analyses while handling all parameters needed.
Genetic algorithms, a type of artificial intelligence algorithm, have been shown to be
the most effective type of optimization. Currently, most companies are not optimizing at
all. Improvements will be recommended for truly gaining an optimized location analysis.


B.  OPTIMIZATION 

In the previous chapter, four unique real estate site selection methods were
discussed. Each method concentrated on one specific site in isolation. The output of a
spatial interaction model could possibly provide information on the following (Chipman,
2005):

•  Sales 

•  Revenue  transfer 

•  Trade  area  (or  study  area)  extent 

However, these spatial interaction model outputs are in isolation. Each model simulation
reflects one possible scenario in a study area. The impact of multiple actions taking
place in a given study area is not accounted for in traditional model outputs. To
ascertain how multiple actions will affect the study area at the aggregate level,
further optimization of model outputs needs to be conducted. Companies may want to look
at how different actions will affect the revenue for a specific area. Deciding whether
to open or close stores will depend on what is best for the study area as a whole.
Looking at only one action in isolation will not give enough information. Analysts would
need to run a potentially large number of scenarios to account for all possible actions
in a study area.

For example, within any given market a company could have 20 stores currently
in existence. It could also be interested in an additional 20 potential sites to open.
To fully analyze the market, an analyst would need to run a spatial interaction model
for each possible scenario. Each scenario would show the effects of opening and/or
closing stores in that particular market. The analyst would need to run every possible
scenario and evaluate the potential of each. Then, the analyst would need to compare all
scenarios and determine which was best for the given study area in order to find the
optimal mix of open and closed stores. The market could potentially change within the
amount of time it would take to run this many scenarios. However, methods are available
to find an optimal market configuration. Artificial intelligence algorithms are such
methods, optimizing markets without analyst intervention. This would allow a company to
account for all possible outcomes while producing an optimized mix of new and existing
stores within a potential market.
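
The size of this search can be made concrete with a short calculation. It assumes every one of the 40 sites may independently be open or closed, and the run time per scenario is a hypothetical figure.

```python
existing_sites = 20
potential_sites = 20

# Each site is either open or closed, so the configurations double
# with every additional site.
scenarios = 2 ** (existing_sites + potential_sites)
print(f"{scenarios:,} possible market configurations")

SECONDS_PER_RUN = 1.0  # hypothetical time for one model run
years = scenarios * SECONDS_PER_RUN / (365 * 24 * 3600)
print(f"about {years:,.0f} years at one run per second")
```

Even at one model run per second, exhaustive evaluation is out of reach, which is the practical argument for a search method that samples the configurations instead.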

The scenario in the previous chapter is an example that could be optimized in
order to determine the optimum market mixture of stores for the company. The results of
the example had the Northlands site opening and Pearl Street closing. However, this was
based on 34 separate scenarios. The potential interactions between all sites were not
evaluated, and the 34 scenarios that were run did not include every possible market
configuration. The outcome was the optimal result based on the scenarios that were run;
however, this may not have been the optimal result for the market as a whole. Each
scenario looked at one site and its effects on the surrounding sites in isolation.
Artificial intelligence algorithms would have been able to run these scenarios in
tandem, allowing a model output that takes into account all possible actions. The
decision to open the Northlands location and close Pearl Street may be the optimized
result; however, until the scenario is optimized, the true answer will not be known.

There are many types of optimization techniques that are utilized to find a result
for a given problem. However, depending on the type of problem at hand, the
optimization technique utilized may not provide the desired results. Standard algorithms
are a common means to find optimized results for problems. Algorithms are
mathematical procedures with a finite set of instructions. They take an initial state,
process it through the set of instructions, and provide a finite end-state. An algorithm
will look for a maximum or minimum based on the instructions provided.
Unfortunately, standard algorithms will not satisfy the real estate site selection
problem due to the problem’s complexity. Complex problems, such as site selection, will
generate many local maxima and minima. For this reason, more complex optimization
procedures are required. These procedures must be able to account for local maxima and
minima while searching for the absolute maximum and/or minimum of the problem.
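
The difficulty with local maxima can be seen in a small sketch. The profit landscape below is hypothetical, and the simple hill climber stands in for a standard algorithm that only ever moves to a better neighboring solution.

```python
def hill_climb(values, start):
    """Move to the better neighbor until no neighbor improves."""
    i = start
    while True:
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(values)]
        best = max(neighbors, key=lambda j: values[j])
        if values[best] <= values[i]:
            return i  # no neighbor is better: a (possibly local) maximum
        i = best


# Hypothetical profit for eight neighboring solutions.
profits = [1, 3, 5, 4, 2, 6, 9, 7]

stuck = hill_climb(profits, start=0)
print(f"from index 0: stops at index {stuck}, profit {profits[stuck]}")

# Restarting from every position finds the absolute maximum here, but
# this is only possible because the example is tiny.
finishes = [hill_climb(profits, s) for s in range(len(profits))]
best = max(finishes, key=lambda i: profits[i])
print(f"best over all starts: index {best}, profit {profits[best]}")
```

Starting from the left, the search stops at a profit of 5 even though a profit of 9 exists elsewhere; a method for the site selection problem needs some way to escape such local maxima.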

1.  Artificial  Intelligence 

Artificial intelligence (AI) algorithms can be used as an optimization technique to
account for the shortcomings of simplistic algorithms. AI automates decision trees
quickly, alleviating the need for an analyst to conduct multiple scenario evaluations in
isolation. Utilizing AI algorithms allows companies to optimize site allocation while
including all pertinent parameters. Densham (1991) shows that AI algorithms are quicker
than exact methods and can include multiple objective functions, or models, to analyze
the data. However, he also points out a drawback in the use of AI algorithms: they are
not necessarily exact. These methods attempt to find absolute maxima and/or minima. The
greater the number of local maxima and/or minima analyzed by the AI algorithm, the
better the estimate of the optimized solution to the site selection problem. Overall, AI
algorithms can solve these problems more quickly than traditional linear methods for
problems with more than 25 existing sites (Houck, Joines, & Kay, 1996).

There are many types of AI algorithms available for solving the site selection
problem. Each of these AI algorithms will need objective functions to obtain data. The
models discussed in the previous chapter represent the objective functions required to
feed into the AI algorithm. Figure 11 shows how the model output is fed into the
artificial intelligence algorithm to obtain optimized results (Chipman, 2005).


[Figure 11 diagram: the site location analysis models (checklist method, analog method,
regression modeling) feed into the AI algorithm.]

Figure  11.  Application  of  Artificial  Intelligence


The AI algorithm will take the model outputs and analyze the data to find local maxima
and/or minima, depending on the desired outcome of the scenario. If the goal were to
maximize profit, maxima would be evaluated. If costs were to be minimized, minima would
be evaluated.

Three of the AI algorithms that could analyze this data are Random Restart, Two-Opt
Switching, and Genetic Algorithms. This project will focus specifically on the use of
genetic algorithms as a form of artificial intelligence. Genetic algorithms have been
shown to “provide better solutions than either of the traditional procedures... and with
less computational effort” (Houck et al., 1996). When dealing with problems that involve
increasingly large numbers of parameters, genetic algorithms have been shown to optimize
solutions more quickly and with less computational power (Kim, 2001). In other words,
genetic algorithms become more useful in optimization as the number of parameters
increases. Other forms of AI algorithms require more computer power and take longer to
complete.

The  use  of  genetic  algorithms  has  been  seen  as  an  advantage  for  over  ten  years.
Church  and  Sorensen  saw  the  potential  for  genetic  algorithms  in  1994,  stating:

Even  though  the  genetic  algorithm  can  produce  extremely  good  results, 
solution  times  are  usually  much  larger  than  other  techniques.  Such  a 
process  might  be  a  candidate  when  computational  resources  are  very 
large... 

When this was written in 1994, researchers were first recognizing genetic algorithms
and their potential. Houck, Joines, and Kay (1996) also demonstrated the potential
of genetic algorithms. They performed seven tests specifically designed to show the
quality of the solution obtained and the computational efficiency of the solution.
Smaller-sized problems (fewer than 25 initial sites) showed little performance
difference among multiple AI methods. However, the ability of genetic algorithms,
specifically through the use of genetic crossover operations, to obtain a better
solution by combining parts of two existing solutions provided more optimal solutions
overall with equivalent computational effort. For these reasons, only genetic algorithms
will be examined in this project.

2.  Genetic  Algorithms 

Genetic algorithms were founded on the principle of biological evolution,
such as survival of the fittest. Genetic algorithm techniques are applicable
to many difficult optimization problems by using evolutionary parameters
(i.e. population and generation sizes), genetic operators (i.e. crossover and
mutation) and other evolution control criteria (i.e. selection, pressure,
termination condition and fitness scaling). (Kim, 2001)

Young-Hoon Kim’s quote describes genetic algorithms in a context showing
similarities to biological evolution. This is the basis for genetic algorithms. If a
candidate solution is not better than its predecessors, it is thrown out. Conversely, if
the solution is better than its predecessors, it is kept. This process continues until
an optimal solution is found.

Hurley, Moutinho, and Stephens (1995) illustrate the use of genetic algorithms
with the goal of meeting the following three expectations:

1.  Identify the appropriate number of sites to use within a market.

2.  Identify sites to be removed from the existing network of stores
or sites. In other words, identify sites that should be closed to obtain
an optimum market based on consumer actions.

3.  Find the best subset of sites to market a specific product or service.
This could be retail clothing, restaurants, oil-change stores for cars, or
any other product or service desired by the company.

These expectations are best pursued when analyzing a market that contains multiple
current and potential sites. Hurley et al. (1995) base their examples on a study area
with 50 or more sites and several other proposed sites. This does not mean that results
will be incorrect for smaller markets. It simply means that larger markets need to be
analyzed in greater depth and that genetic algorithms can do a great deal of the work
for the analyst in a relatively short amount of time.

Figure 12 outlines the flowchart of the genetic algorithm process (Kim,
2001). Kim shows the basic structure of genetic algorithms during computation. These
algorithms treat each problem as an evolutionary sequence. Ultimately, once the output is
evaluated as sufficiently more desirable than the initialization, the process will terminate.
Termination is based upon pre-defined termination rules. However, this process
must be run for every possible scenario.

Source: Young-Hoon Kim’s Identifying evolutionary searching mechanism of genetic algorithms for regional science
modeling from the Sheffield Centre for Geographic Information and Spatial Analysis (SCGISA), 2001.

Figure  12.  Genetic  Algorithm  Structure  Flowchart 

Within the genetic algorithm flowchart, each block represents a function of the
computerized process. One cycle through the flowchart is considered a generation, so each
pass creates a new generation. Kim (2001) goes on to explain the nomenclature for this
process, which is borrowed from biology. The initial set of solutions, often generated
randomly, is called the population. A population is made up of a set of chromosomes. Each
chromosome represents a set of locations, which are called genes. A gene in a given
chromosome can be represented by a binary bit that can be switched on (1) or off (0). The
following is an example of a potential chromosome that has 10 sites:

1001100101 (10-site chromosome)

This chromosome is made of 10 genes that represent sites in a study area. The first
binary digit on the left is a “1”. This means that site 1 is open. The second binary digit
from the left is a “0”. This means that site 2 is closed. This chromosome is a potential
solution for a study area of 10 sites.

a. Initialization


The initialization creates the starting point for the genetic algorithm. The
required data for a given market or study area can be formulated randomly
or by using specific information about the problem at hand. At this stage the analyst will
set up rules for the genetic algorithm. The number of chromosomes in the population,
crossover and mutation data (to be discussed later), and termination rules will all be set
up during the initialization process. Once the population is initialized, the chromosomes
can be encoded.
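The initialization step can be sketched in a few lines of Python. This is a minimal illustration, assuming a purely random starting population; the function name `init_population`, the `settings` dictionary, and every numeric value in it are hypothetical choices made for the example, not values taken from Kim (2001) or Hurley et al. (1995).

```python
import random


def init_population(num_chromosomes, num_sites, seed=None):
    """Create a random starting population of bit-string chromosomes."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice("01") for _ in range(num_sites))
        for _ in range(num_chromosomes)
    ]


# Rules the analyst fixes before the first generation (illustrative values only).
settings = {
    "population_size": 5,
    "num_sites": 7,
    "crossover_probability": 0.75,
    "mutation_probability": 0.001,
    "max_generations": 100,
}

population = init_population(
    settings["population_size"], settings["num_sites"], seed=1
)
```

An analyst with prior knowledge of the market could instead seed the population with hand-picked chromosomes, which corresponds to the "specific information" option described above.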

b.  Encoding 

Each individual site in the study area is represented by a gene.
Chromosomes will have sites encoded as on or off depending on the initialization rules.
This is only a starting point for the optimization process. As the process progresses,
chromosomes will be examined and changed based upon pre-determined rules while
searching for an optimized outcome.
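As a small illustration of the encoding, the sketch below converts between a bit-string chromosome and the list of open sites, using the 10-site chromosome shown earlier in this section. The helper names `open_sites` and `encode` are hypothetical and are introduced only for this example.

```python
def open_sites(chromosome):
    """Return the 1-based numbers of the sites switched on in a chromosome."""
    return [i for i, gene in enumerate(chromosome, start=1) if gene == "1"]


def encode(open_site_numbers, num_sites):
    """Build a bit-string chromosome from a collection of open site numbers."""
    wanted = set(open_site_numbers)
    return "".join("1" if i in wanted else "0" for i in range(1, num_sites + 1))


print(open_sites("1001100101"))        # [1, 4, 5, 8, 10]
print(encode([1, 4, 5, 8, 10], 10))    # 1001100101
```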

c.  Evaluation 

Each chromosome in the population is evaluated against set probabilistic
rules. Chromosomes that produce better outcomes for the study area have a greater chance
of being selected for the next generation. Inferior chromosomes are more likely to be
discarded. However, the search can settle on local maxima and/or minima. This means that
just because a set of chromosomes is optimal in one generation, it may not be optimal in
the next. To reduce the risk of stopping at a local maximum, some less optimal chromosomes
will survive the evaluation and move on to the next generation.
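One common way to give fitter chromosomes a greater chance of selection while still letting weaker ones survive occasionally is fitness-proportional ("roulette wheel") selection. The source does not name a specific selection scheme, so the sketch below is an assumption; the function name `select_parents` is hypothetical, and the chromosomes and fitness values are the hypothetical ones from the sample genetic algorithm later in this chapter.

```python
import random


def select_parents(population, fitnesses, k, seed=None):
    """Draw k chromosomes, each with probability proportional to its fitness.

    Less fit chromosomes keep a nonzero chance of being drawn, which helps
    the search avoid settling on a local maximum too early.
    """
    rng = random.Random(seed)
    return rng.choices(population, weights=fitnesses, k=k)


population = ["1010011", "0011110", "1010001", "1111000", "1110101"]
fitnesses = [10542, 12321, 13222, 11214, 10499]

parents = select_parents(population, fitnesses, k=2, seed=42)
```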

d.  Probabilistic  Rules 

The probabilistic rules come from the models discussed in the previous
chapter. These objective functions test the chromosomes for their suitability. As the
genetic algorithm progresses, the overall suitability (fitness) of the better chromosomes
should increase. The overall fitness of the population in general should also increase as
future generations are created.

In the case of site selection, a company would be looking to maximize
revenue by opening and/or closing stores to obtain an optimum mix of stores in a market.
At the same time, the company would also want to minimize the impacts of
cannibalization. This can be done by using the site selection models discussed in
the previous chapter to examine sales forecasts and/or cannibalization.

If the genetic algorithm were looking to optimize a sales forecast, it could
use an analog model for its probabilistic rules. Each gene in a given chromosome would
be evaluated against a set of analogous sites. The gene, or site, would be assigned a
revenue based upon the existing set of similar stores in the analog database. The revenue
of each gene that is turned on (set to 1) would then be used to determine the
fitness of the chromosome.
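The fitness calculation described here can be sketched as a sum over the genes that are switched on. The per-site revenues below are invented for illustration and stand in for the output of an analog model; the function name `chromosome_fitness` is hypothetical. The sketch deliberately ignores cannibalization, which a fuller objective function would need to subtract.

```python
def chromosome_fitness(chromosome, site_revenue):
    """Sum the analog-model revenue of every site switched on (gene == '1')."""
    if len(chromosome) != len(site_revenue):
        raise ValueError("one revenue figure is required per gene")
    return sum(rev for gene, rev in zip(chromosome, site_revenue) if gene == "1")


# Hypothetical analog-model revenue for sites 1 through 7.
site_revenue = [2500, 1800, 3100, 2200, 2900, 1500, 2700]

# Sites 3, 4, 5, and 6 are open: 3100 + 2200 + 2900 + 1500.
print(chromosome_fitness("0011110", site_revenue))  # 9700
```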

e.  Crossover  and  Mutation 

During this phase, new chromosomes are formed. These new
chromosomes may carry traits similar to their parent chromosomes. Crossover and
mutation represent the evolutionary aspect of genetic algorithms by simulating the
biological process of genetics. Just as human parents pass along genetic traits
to their children through inheritance, chromosomes within genetic algorithms will pass
along specific traits to future generations based upon crossover and mutation.

1) Crossover. Crossover merges two chromosomes from the
current generation. It takes two separate chromosomes and merges parts of each to
create new chromosomes (children). De Jong (1975) has shown through empirical studies
that better results are ultimately achieved when crossover probability is between 0.65
and 0.85. In other words, the probability that a chromosome will continue to the next
generation unchanged is between 0.15 and 0.35. The majority of chromosomes will undergo
crossover, while only a minority will move to the next generation
unchanged. Standard crossover is one-point crossover, where two selected parents
cross over at a randomly selected point:

Parent 1: X1, X2, X3, X4, X5
Parent 2: Y1, Y2, Y3, Y4, Y5
Hypothetical crossover at point 3
Child 1: X1, X2, X3, Y4, Y5
Child 2: Y1, Y2, Y3, X4, X5

This shows the two children with the fourth and fifth positions crossed over from their
parents’ original chromosome sets.
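One-point crossover can be sketched as two slice operations. The function names below are hypothetical, and the default probability of 0.75 is simply an arbitrary value inside the 0.65 to 0.85 range cited above. The cut point follows the convention of the X/Y illustration: a crossover at point 3 keeps the first three genes and swaps the rest.

```python
import random


def one_point_crossover(parent1, parent2, point):
    """Swap everything after position `point` between the two parents."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def maybe_crossover(parent1, parent2, probability=0.75, rng=random):
    """Apply crossover with the given probability; otherwise pass parents on unchanged."""
    if rng.random() < probability:
        point = rng.randint(1, len(parent1) - 1)  # cut strictly inside the chromosome
        return one_point_crossover(parent1, parent2, point)
    return parent1, parent2


p1 = ["X1", "X2", "X3", "X4", "X5"]
p2 = ["Y1", "Y2", "Y3", "Y4", "Y5"]
child1, child2 = one_point_crossover(p1, p2, 3)
print(child1)  # ['X1', 'X2', 'X3', 'Y4', 'Y5']
print(child2)  # ['Y1', 'Y2', 'Y3', 'X4', 'X5']
```

Because the functions rely only on slicing, they work equally well on the bit-string chromosomes used elsewhere in this chapter.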

2) Mutation. Mutation means that the algorithm randomly modifies a
chromosome. Just by switching a single gene in a chromosome on or off, a new
chromosome will be formed. If both parents had position two set off, then under crossover
alone all children would have this same off position for position two. Mutation prevents
this from occurring. Mutation will randomly switch genes from off to on and vice versa.
This is generally done infrequently (for example, 1 gene in 1,000).
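A bit-flip mutation operator can be sketched as follows. Treating the "1 in 1000" figure as an independent per-gene flip probability is an assumption made for this example, and the function name `mutate` is hypothetical.

```python
import random


def mutate(chromosome, rate=0.001, rng=random):
    """Flip each gene independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[gene] if rng.random() < rate else gene for gene in chromosome
    )


# With rate 1.0 every gene flips; with rate 0.0 nothing changes.
print(mutate("0000000", rate=1.0))  # 1111111
print(mutate("1010110", rate=0.0))  # 1010110
```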

f. Evolution

A new set of chromosomes is ready to be analyzed following crossover
and mutation. This new set of chromosomes has evolved and can be fed into the
probabilistic rules again to create a new generation of chromosomes. The same
probabilistic rules are used during this stage of the process as before.
At this point, the new generation must be reevaluated to determine whether it
is fitter than the previous one.

g. Termination

Eventually the genetic algorithm will terminate. There are two ways that
the algorithm will stop producing future generations. First, during the initialization
procedures, the analyst can set a pre-determined number of generations to process. The
genetic algorithm will process the set number of generations and then terminate with a
result. Second, the process can terminate based upon the level of improvement from
one generation of chromosomes to the next. If there is no statistically significant
improvement between subsequent generations, the genetic algorithm will terminate.
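Putting the steps together, the sketch below runs a complete loop with both termination rules. It is a simplified illustration under several assumptions rather than the procedure of Kim (2001): selection is fitness-proportional, "no statistically significant improvement" is approximated by a `patience` counter on the best fitness, fitness values must be non-negative, and the toy revenue and cost figures are invented. All names are hypothetical.

```python
import random


def run_ga(fitness, num_sites, pop_size=20, crossover_p=0.75, mutation_p=0.001,
           max_generations=200, patience=25, seed=0):
    """Return (best_chromosome, best_fitness, generations_run).

    `fitness` must return a non-negative number for any bit string.
    """
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(num_sites))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    stale = 0       # generations since the best fitness last improved
    generation = 0
    # Rule 1: stop after max_generations. Rule 2: stop when improvement stalls.
    while generation < max_generations and stale < patience:
        generation += 1
        # A small constant keeps every selection weight strictly positive.
        weights = [fitness(c) + 1e-9 for c in population]
        children = []
        while len(children) < pop_size:
            a, b = rng.choices(population, weights=weights, k=2)
            if rng.random() < crossover_p:
                point = rng.randint(1, num_sites - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            children += [a, b]
        # Mutation: flip each gene independently with probability mutation_p.
        population = [
            "".join(("1" if g == "0" else "0") if rng.random() < mutation_p else g
                    for g in child)
            for child in children[:pop_size]
        ]
        challenger = max(population, key=fitness)
        if fitness(challenger) > fitness(best):
            best, stale = challenger, 0
        else:
            stale += 1
    return best, fitness(best), generation


REVENUE = [5, 3, 8, 2, 7, 4, 6]   # hypothetical revenue per site
COST_PER_SITE = 4                 # hypothetical cost of keeping a site open


def toy_fitness(chromosome):
    """Net profit of the open sites, floored at zero to keep weights valid."""
    profit = sum(r - COST_PER_SITE for g, r in zip(chromosome, REVENUE) if g == "1")
    return max(profit, 0)


best, score, generations = run_ga(toy_fitness, num_sites=len(REVENUE))
print(best, score, generations)
```

In this toy market the best achievable score is 10 (open only the sites whose revenue exceeds the cost), which gives a simple yardstick for judging how close a run gets.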


C.  OPTIMIZING  SITE  SELECTION  UTILIZING  GENETIC  ALGORITHMS 

Optimization can lead to efficiencies. In other words, an optimized organization
is efficient. Many companies understand that site location is a part of this optimization
process. This chapter has focused on providing a means to optimize the real estate site
selection process. Site selection models can be fed into genetic algorithms for
optimization. The type of model utilized matters little to the optimization step itself.
The key is taking valid model output and inputting it into an artificial intelligence
algorithm for optimization. The use of artificial intelligence algorithms, specifically
genetic algorithms, can provide an improved decision aid in the real estate site selection
process. The following example provides a basis for how this optimization would affect
possible decisions.

1.  Sample  Genetic  Algorithm 

A simple example (Hurley, Moutinho, & Stephens, 1995) of how genetic
algorithms work is presented to show the artificial intelligence evaluation process in
practice. For this example, five potential chromosomes will be initially evaluated
(initialization).

C1 1010011
C2 0011110
C3 1010001
C4 1111000
C5 1110101

There are four existing sites and three potential sites to be evaluated. Each of the genes
in the chromosomes represents whether a site would be open or closed. The first four
genes from left to right represent the four existing stores. The last three represent the
potential new stores. C4 shows that all four existing stores would remain open, while the
three new stores would not open.

For this hypothetical example, the analog model will be utilized for the
probabilistic rules. Each chromosome would be run through the analog model to
ascertain its fitness level. The fitness level would be the total revenue expected for a
market based upon the assigned open and closed stores. Each gene in the chromosome
would be assigned the revenue of an analog of similar sites in the company’s
databases. For C2, this would mean approximating revenue with sites 1, 2,
and 7 closed and sites 3, 4, 5, and 6 open. The hypothetical outputs are as
follows:


C1 1010011 fitness = 10,542
C2 0011110 fitness = 12,321
C3 1010001 fitness = 13,222
C4 1111000 fitness = 11,214
C5 1110101 fitness = 10,499

Since chromosomes C2 and C3 have the highest fitness, they would be the most likely
subjects for crossover and mutation. If one-point crossover were randomly chosen at point
4 (after the fourth gene), the following children would be produced:

C2 0011001
C3 1010110

These two child chromosomes would then be put through the analog model
(probabilistic rules) to come up with the following hypothetical outputs:

C2 0011001 fitness = 11,229
C3 1010110 fitness = 14,017

This shows that there is now a network of sites, represented by the child C3, which is
theoretically better than all previous chromosomes in the population. This does not take
into account any mutation at this point. Mutation could possibly bring about “fitter”
solutions.
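The sample can be retraced in code. The sketch below uses only the chromosomes and hypothetical fitness values quoted in the sample; the lookup table stands in for the analog model, and the variable names are illustrative. The listed children are reproduced when the cut falls after the fourth gene, which matches the convention of the earlier X/Y crossover illustration.

```python
# Hypothetical analog-model outputs quoted in the sample (parents and children).
fitness_table = {
    "1010011": 10542,
    "0011110": 12321,
    "1010001": 13222,
    "1111000": 11214,
    "1110101": 10499,
    "0011001": 11229,
    "1010110": 14017,
}

initial_population = ["1010011", "0011110", "1010001", "1111000", "1110101"]

# Pick the two fittest chromosomes as parents.
ranked = sorted(initial_population, key=fitness_table.get, reverse=True)
parent_c3, parent_c2 = ranked[0], ranked[1]

# One-point crossover: keep the first four genes, swap the last three.
point = 4
child_c2 = parent_c2[:point] + parent_c3[point:]
child_c3 = parent_c3[:point] + parent_c2[point:]

best_overall = max(initial_population + [child_c2, child_c3], key=fitness_table.get)

print(parent_c2, parent_c3)   # 0011110 1010001
print(child_c2, child_c3)     # 0011001 1010110
print(best_overall, fitness_table[best_overall])  # 1010110 14017
```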

This process will be run multiple times, as dictated by the analyst, until an optimum
is produced. The computational power required for this analysis can be
considerable. As computer power increases and becomes more readily available, this type
of analysis will also become easier to conduct.

2.  Example  of  Genetic  Algorithms  in  Practice 

Genetic algorithms have previously been used in practice, although they often
require more computing power than is readily available. As
the price of computing power decreases, the potential use of genetic algorithms for
optimization in real estate site selection can increase. Few practical real estate site
selection examples are available.

The following example provided by Felicity George (1994) shows the advantages
of using genetic algorithms versus another heuristic optimization technique for
optimizing networks of car dealerships in England. GMAP Ltd.’s heuristic
optimization model, called the idealized representation plan (IRP), and genetic algorithms
were both used to optimize a network of car dealerships. Spatial interaction models acted
as probabilistic rules within each optimization. Ultimately the fitness of each technique
was compared after a specified number of generational computations.

The car company had previously utilized the IRP model to optimize its network of
dealerships. The IRP model is another heuristic algorithm. This algorithm selects data
points that are profitable individually but may not combine to produce an optimal network.
IRP optimization works best with a larger set of locations, which provides a
greater set of combinations to analyze. Both techniques would produce
multiple generations, and the fitness of those generations would be the
ultimate guide to determine whether there was a clear advantage to genetic algorithms over
other techniques. A greater fitness for a generation would imply a greater profit
from the optimized dealership network. George conducted this experiment with various
termination rules. Networks of dealerships with 10, 50, and 75 locations were utilized to
see if there was a noticeable difference in overall fitness.

Figure 13 shows the optimization outputs of the three runs. For dealership
networks of 10 and 50 locations, a genetic algorithm optimization provides a higher
fitness than the IRP and random search techniques. When George examined only a
10-dealership network, genetic algorithms consistently outperformed other techniques after
30 generations. Fitness continued to increase up to about 110 generations, where an
apparent maximum fitness is reached. When George increased the dealership network size to
50 dealers, genetic algorithms outperformed the other techniques only when RRR crossover
was utilized. RRR crossover is a type of genetic crossover utilized within the genetic
algorithm operators.

The optimization with 75 dealership locations showed a different output. The IRP
solution provided a greater fitness after 850 generations (based upon termination rules).
However, these results should be examined more closely. George stated that time was
too limited to run the genetic algorithms beyond 850 generations, noting that it took over
17 hours to complete 850 generations on the computing machines available in 1994. Although
genetic algorithms did not outperform the IRP after 850 generations, they were still
progressing at an increasing rate. If a longer time for termination were set,
genetic algorithms could possibly outperform IRP solutions for larger networks.

Source: Felicity George’s Hybrid Genetic Algorithms with Immunisation to Optimise Networks of Car Dealerships
from the Edinburgh Parallel Computing Centre, 1994.

Figure 13. Networks of 10, 50 and 75 dealerships


The difference in fitness within each graph could represent profit for the
company. If the automotive company had used a random search or IRP technique for
optimization, it could have missed $500,000 or $200,000, respectively, depending on
which technique it used for a network of 10 dealerships. The missed profit is
noticeable for each scenario. However, computing speed that was not available
in 1994 is available today to the average real estate analyst. Scenarios with over 75
dealerships can be examined today much more quickly. Using an appropriate
optimization technique is a means to help identify the optimal network of sites
for a company in a study area.

3.  Optimizing  Real  Estate  Site  Selection  in  Practice 

Optimizing the real estate site selection process aims to accomplish two tasks:
maximize company profit in a specified market and avoid opening “dogs”. Dogs
are locations that are not profitable or feasible for an organization. “Dog” stores
will generally be closed in the future, so a company would want to avoid them.
The costs to acquire a location, build the site, train personnel, and supply product can be
large. Closing a location requires even more money and can lead to huge losses in a
market. Avoiding “dog” locations is therefore very desirable to many companies.

Optimizing real estate site selection through the use of genetic algorithms presents
value to companies looking to optimize their network of locations. As discussed earlier,
companies can help maximize their profits through genetic algorithms. Companies can
also avoid opening “dog” locations. However, how much is this optimization
of profit worth to a company? This is the question that drives the value of this
optimization within a company.

The concept of utilizing AI algorithms to optimize the real estate site selection
process is new. Very few companies are using this technique. Training personnel on AI
algorithms as a means of optimization may not be the best avenue for
companies to gain this real estate advantage. Hiring third-party consultants may be a
better means to obtain this service.

D.  CHAPTER  SUMMARY 

As competition increases in markets, efficiencies will need to be found throughout
all levels of organizations. Site locations will need to be optimized. Gaining an edge
through optimized site locations can become a new market in itself. This chapter
provided an insight into an optimization technique for real estate site selection. There are
many techniques available to optimize results; however, only artificial intelligence was
discussed. Specifically, genetic algorithms as a form of artificial intelligence were shown
to be the most effective form of optimization. Genetic algorithms provided the most
comprehensive form of artificial intelligence requiring the least amount of time for
results. As the cost of computing decreases, genetic algorithms will be even easier to
utilize. More and more companies and consulting firms will be able to utilize this form
of optimization as this concept gains acceptance.


V. SUMMARY AND RECOMMENDATIONS


A.  SUMMARY 

The  goals  of  this  project,  as  stated  in  the  introduction,  were  to  answer  the 
following  questions: 

1.  Primary  Questions 

•  Is  there  a  current  need  for  market  planning  modeling?  If  so,  why? 

•  To  what  extent  do  artificial  intelligent  algorithms  improve  the  market 
planning  process? 

•  Can  artificial  intelligent  algorithms  be  applied  successfully  to  market 
planning? 

2.  Secondary  Question 

•  What  are  the  limitations  to  current  market  planning  models? 

These questions were answered in this project, providing insight into real estate site
selection techniques and ways to optimize selection through artificial intelligence.

Market  planning  modeling  is  currently  needed.  Markets  are  becoming  more 
complex  with  new  competitors  emerging.  To  stay  competitive  in  the  market,  companies 
can  utilize  computer  models  such  as  analog,  regression,  and  spatial  interaction  models. 
These  will  help  companies  manage  multiple  parameters  while  giving  them  insight 
previously  unavailable  through  the  checklist  method. 

Adding artificial intelligence algorithms, specifically genetic algorithms, to a
company’s real estate site selection process makes sense. Optimizing location analysis
enables a company to gain efficiencies in site locations, hence providing a competitive
edge. Saving money can generate increased profit even without increased revenue. Also,
avoiding the opening of bad site locations is potentially a huge savings. The small
investment in optimized site location analysis can save money in the future.

Genetic algorithms have been applied in market planning. Unfortunately, only
theoretical data is provided to show quantifiable improvements in optimization. However,
these improvements should increase the competitive edge for companies utilizing their
benefits. The market planning example shown in Colorado did not account for all
possible parameters in that market. Possible over-estimates were created in that analysis
that could provide bad data to be carried forward in future analyses. An optimization of
the model data would have helped to minimize bad data.

Current  market  planning  models  do  require  some  technical  knowledge  and 
computing  power.  Unfortunately,  not  many  companies  employ  personnel  capable  of 
creating  these  models.  Also,  companies  may  not  have  the  requisite  computing  power 
needed  to  run  these  models  and  optimization  techniques. 

One  method  that  small  and  medium  sized  companies  could  employ  to  gain  the 
same  efficiencies  in  real  estate  site  selection  as  large  companies  is  to  hire  third-party 
consultants.  Outsourcing  real  estate  site  selection  or  the  optimization  of  the  process 
could  prove  to  be  more  cost  effective.  This  would  allow  companies  to  focus  on  their  core 
competencies  with  their  existing  workforce. 


B.  APPLICATION  TO  MILITARY  RETAIL  FACILITIES 

Current  military  retail  facilities  are  aligned  with  military  bases.  Size  and 
allocation  are  dependent  upon  base  size  and  location.  The  Navy  Exchange  Command 
could  benefit  from  this  research  by  utilizing  site  selection  models  and  optimization 
techniques.  Data  could  easily  be  gathered  from  POS  transactions  at  each  facility. 

POS data could be fed into a spatial interaction model utilizing standard data.
This would enable the Navy Exchange Command to see how its shoppers are truly
interacting with the military facility as opposed to commercial stores. Optimizing model
output through a genetic algorithm would allow the Navy to optimize store allocation,
location, size, and sales. At a time when the United States military is seeking efficiencies
in all areas, this could prove useful.


C. LIMITATIONS TO THIS WORK

1.  Market  Research 

Further  in-depth  market  research  could  be  conducted  in  order  to  gain  a  broader 
sense  of  current  real  estate  site  selection  processes.  This  project  solely  analyzed  a  few 
companies  in  various  industries.  More  research  within  each  industry  would  provide  a 
better  understanding  of  industry-specific  issues  that  are  not  fully  addressed  within  this 
project. 

2.  Optimization  Complexities 

Optimization via artificial intelligence is a complex and evolving field of study.
Innovations and advances are constantly being implemented to gain advantages in
business. Due to the fast pace of this field and its evolving nature, not all
modeling complexities were mentioned.


D.  SUGGESTIONS  FOR  FUTURE  RESEARCH 

This project looked solely at current real estate site selection techniques and a
possible method of improved optimization: artificial intelligence algorithms. A more
in-depth analysis of other industries may provide information on the overall effectiveness
of genetic algorithms. Specifically, an artificial intelligence application could be
developed to test the validity of this approach to optimization. This application could be
compared to initial non-optimized model output and other optimization techniques. This
comparison would be able to provide a true quantitative analysis of the various site
selection and optimization techniques.